
   1 INTEL 80387 PROGRAMMER'S REFERENCE MANUAL 1987
   2
   3 MARCOM DISCLAIMER -- New word: Intel Certified, iRMK, SupportNET
   4 May 26, 1987
   5
   6 Intel Corporation makes no warranty for the use of its products and
   7 assumes no responsibility for any errors which may appear in this document
   8 nor does it make a commitment to update the information contained herein.
   9
  10 Intel retains the right to make changes to these specifications at any
  11 time, without notice.
  12
  13 Contact your local sales office to obtain the latest specifications before
  14 placing your order.
  15
  16 The following are trademarks of Intel Corporation and may only be used to
  17 identify Intel Products:
  18
  19 Above, BITBUS, COMMputer, CREDIT, Data Pipeline, FASTPATH, Genius, i, î,
  20 ICE, iCEL, iCS, iDBP, iDIS, I²ICE, iLBX, im, iMDDX, iMMX, Inboard,
  21 Insite, Intel, intel, intelBOS, Intel Certified, Intelevision,
  22 inteligent Identifier, inteligent Programming, Intellec, Intellink,
  23 iOSP, iPDS, iPSC, iRMK, iRMX, iSBC, iSBX, iSDM, iSXM, KEPROM, Library
  24 Manager, MAPNET, MCS, Megachassis, MICROMAINFRAME, MULTIBUS, MULTICHANNEL,
  25 MULTIMODULE, MultiSERVER, ONCE, OpenNET, OTP, PC BUBBLE, Plug-A-Bubble,
  26 PROMPT, Promware, QUEST, QueX, Quick-Pulse Programming, Ripplemode, RMX/80,
  27 RUPI, Seamless, SLD, SugarCube, SupportNET, UPI, and VLSiCEL, and the
  28 combination of ICE, iCS, iRMX, iSBC, iSBX, iSXM, MCS, or UPI and a numerical
  29 suffix, 4-SITE.
  30
  31 MDS is an ordering code only and is not used as a product name or
  32 trademark. MDS(R) is a registered trademark of Mohawk Data Sciences
  33 Corporation.
  34
  35 *MULTIBUS is a patented Intel bus.
  36 Unix is a trademark of AT&T Bell Labs.
  37 MS-DOS, XENIX, and Multiplan are trademarks of Microsoft Corporation.
  38 Lotus and 1-2-3 are registered trademarks of Lotus Development Corporation.
  39 SuperCalc is a registered trademark of Computer Associates International.
  40 Framework is a trademark of Ashton-Tate.
  41 System 370 is a trademark of IBM Corporation.
  42 AT is a registered trademark of IBM Corporation.
  43
  44 Additional copies of this manual or other Intel literature may be obtained
  45 from:
  46
  47 Intel Corporation
  48 Literature Distribution
  49 Mail Stop SC6-59
  50 3065 Bowers Avenue
  51 Santa Clara, CA 95051
  52
  53 (c)INTEL CORPORATION 1987    CG-5/26/87
  54
  55
  56 Customer Support
  57
  58 ---------------------------------------------------------------------------
  59
  60 Customer Support is Intel's complete support service that provides Intel
  61 customers with hardware support, software support, customer training, and
  62 consulting services. For more information contact your local sales offices.
  63
  64 After a customer purchases any system hardware or software product,
  65 service and support become major factors in determining whether that
  66 product will continue to meet a customer's expectations. Such support
  67 requires an international support organization and a breadth of programs
  68 to meet a variety of customer needs. As you might expect, Intel's customer
  69 support is quite extensive. It includes factory repair services and
  70 worldwide field service offices providing hardware repair services,
  71 software support services, customer training classes, and consulting
  72 services.
  73
  74 Hardware Support Services
  75
  76 Intel is committed to providing an international service support package
  77 through a wide variety of service offerings available from Intel Hardware
  78 Support.
  79
  80 Software Support Services
  81
  82 Intel's software support consists of two levels of contracts. Standard
  83 support includes TIPS (Technical Information Phone Service), updates and
  84 subscription service (product-specific troubleshooting guides and COMMENTS
  85 Magazine). Basic support includes updates and the subscription service.
  86 Contracts are sold in environments which represent product groupings
  87 (e.g., the iRMX environment).
  88
  89 Consulting Services
  90
  91 Intel provides field systems engineering services for any phase of your
  92 development or support effort. You can use our systems engineers in a
  93 variety of ways ranging from assistance in using a new product, developing
  94 an application, personalizing training, and customizing or tailoring an
  95 Intel product to providing technical and management consulting. Systems
  96 Engineers are well versed in technical areas such as microcommunications,
  97 real-time applications, embedded microcontrollers, and network services.
  98 You know your application needs; we know our products. Working together we
  99 can help you get a successful product to market in the least possible time.
 100
 101 Customer Training
 102
 103 Intel offers a wide range of instructional programs covering various
 104 aspects of system design and implementation. In just three to ten days a
 105 limited number of individuals learn more in a single workshop than in
 106 weeks of self-study. For optimum convenience, workshops are scheduled
 107 regularly at Training Centers worldwide, or we can take our workshops to
 108 you for on-site instruction. Covering a wide variety of topics, Intel's
 109 major course categories include: architecture and assembly language,
 110 programming and operating systems, BITBUS and LAN applications.
 111
 112 Training Center Locations
 113
 114 To obtain a complete catalog of our workshops, call the nearest Training
 115 Center in your area.
 116
 117 Boston                    (617) 692-1000
 118 Chicago                   (312) 310-5700
 119 San Francisco             (415) 940-7800
 120 Washington D.C.           (301) 474-2878
 121 Israel                    (972) 349-491-099
 122 Tokyo                     03-437-6611
 123 Osaka (Call Tokyo)        03-437-6611
 124 Toronto, Canada           (416) 675-2105
 125 London                    (0793) 696-000
 126 Munich                    (089) 5389-1
 127 Paris                     (01) 687-22-21
 128 Stockholm                 (468) 734-01-00
 129 Milan                     39-2-82-44-071
 130 Benelux (Rotterdam)       (10) 21-23-77
 131 Copenhagen                (1) 198-033
 132 Hong Kong                 5-215311-7
 133
 134
 135 Preface
 136
 137 ---------------------------------------------------------------------------
 138
 139 This manual describes the 80387 Numeric Processor Extension (NPX) for the
 140 80386 microprocessor. Understanding the 80387 requires an understanding of
 141 the 80386; therefore, a brief overview of 80386 concepts is presented first.
 142 A detailed discussion of the 80386 microprocessor can be found in the 80386
 143 Programmer's Reference Manual.
 144
 145 The 80386 Microsystem
 146
 147 The 80386 is the basis of a new VLSI microprocessor system with exceptional
 148 capabilities for supporting large-system applications. This powerful
 149 microsystem is designed to support multiuser reprogrammable and real-time
 150 multitasking applications. Its dedicated system support circuits simplify
 151 system hardware; sophisticated hardware and software tools reduce both the
 152 time and the cost of product development. The 80386 microsystem offers a
 153 total-solution approach, enabling you to develop high-speed, interactive,
 154 multiuser, multitasking -- even multiprocessor -- systems more rapidly and at
 155 higher performance than ever before.
 156
 157   •  Reliability and system up-time are becoming increasingly important in
 158      all applications. Information must be protected from misuse or
 159      accidental loss. The 80386 includes a sophisticated and flexible
 160      four-level protection mechanism that can isolate layers of operating
 161      system programs from application programs to maintain a high degree of
 162      system integrity.
 163
 164   •  The 80386 addresses up to 4 gigabytes of physical memory to support
 165      today's application requirements. This large physical memory enables
 166      the 80386 to keep many large programs and data structures
 167      simultaneously in memory for high-speed access.
 168
 169   •  For applications with dynamically changing memory requirements, such
 170      as multiuser business systems, the 80386 CPU provides on-chip memory
 171      management and virtual memory support. On an 80386-based system, each
 172      user can have up to 64 terabytes of virtual-address space. This large
 173      address space virtually eliminates restrictions on the size of programs
 174      that may be part of the system. The memory management features are
 175      subject to control of systems software; therefore, systems software
 176      designers can choose among a variety of memory-organization models.
 177      Systems designers can choose to view memory in terms of fixed-length
 178      pages, in terms of variable length segments, or as a combination of
 179      pages and segments. The sizes of segments can range from one byte to 4
 180      gigabytes. Virtual memory can be implemented either at the level of
 181      segments or at the level of pages.
 182
 183   •  Large multiuser or real-time multitasking systems are easily supported
 184      by the 80386. High-performance features, such as a very high-speed task
 185      switch, fast interrupt-response time, intertask protection,
 186      page-oriented virtual memory, and a quick and direct operating system
 187      interface, make the 80386 highly suited to multiuser/multitasking
 188      applications.
 189
 190   •  The 80386 has two primary operating modes: real-address mode and
 191      protected mode. In real-address mode, the 80386/80387 is fully upward
 192      compatible from the 8086, 8088, 80186, and 80188 microprocessors and
 193      from the 80286 real-address mode; all of the extensive libraries of
 194      8086 and 8088 software execute 15 to 20 times faster on the 80386,
 195      without any modification.
 196
 197   •  In protected-address mode, the advanced memory management
 198      and protection features of the 80386 become available, without any
 199      reduction in performance. Upgrading 8086 and 8088 application
 200      programs to use these new memory management and protection features
 201      usually requires only reassembly or recompilation (some programs may
 202      require minor modification). Entire 80286 protected-mode applications
 203      can run in this mode without modification.
 204
 205   •  The virtual-8086 mode of the 80386 is available when the primary mode
 206      is protected mode. Virtual-8086 mode enables direct execution of
 207      multiple 8086/8088 programs within a protected-mode environment. Most
 208      8086 and 8088 application programs can be executed in this environment
 209      without alteration (refer to the 80386 Programmer's Reference Manual
 210      for differences from 8086). This high degree of compatibility between
 211      80386 and earlier members of the 8086 processor family reduces both
 212      the time and the cost of software development.
 213
 214 The Organization of This Manual
 215
 216 This manual describes the 80387 Numeric Processor Extension (NPX) for the
 217 80386 microprocessor. The material in this manual is presented from the
 218 perspective of software designers, both at an applications and at a systems
 219 software level.
 220
 221   •  Chapter 1, "Introduction to the 80387 Numerics Processor Extension,"
 222      gives an overview of the 80387 NPX and reviews the concepts of numeric
 223      computation using the 80387.
 224
 225   •  Chapter 2, "80387 Numerics Processor Architecture," presents the
 226      registers and data types of the 80387 to both applications and systems
 227      programmers.
 228
 229   •  Chapter 3, "Special Computational Situations," discusses the special
 230      values that can be represented in the 80387's real formats -- denormal
 231      numbers, zeros, infinities, NaNs (not a number) -- as well as numerics
 232      exceptions. This chapter should be read thoroughly by systems
 233      programmers, but may be skimmed by applications programmers. Many of
 234      these special values and exceptions may never occur in applications
 235      programs.
 236
 237   •  Chapter 4, "80387 Instruction Set," provides functional information
 238      for software designers generating applications for systems containing
 239      an 80386 CPU with an 80387 NPX. The 80386/80387 instruction set
 240      mnemonics are explained in detail.
 241
 242   •  Chapter 5, "Programming Numeric Applications," provides a description
 243      of programming facilities for 80386/80387 systems. A comparative 80387
 244      programming example is given.
 245
 246   •  Chapter 6, "System-Level Numeric Programming," provides information of
 247      interest to systems software writers, including details of the 80387
 248      architecture and operational characteristics.
 249
 250   •  Chapter 7, "Numeric Programming Examples," provides several detailed
 251      programming examples for the 80387, including conditional branching,
 252      the conversion between floating-point values and their ASCII
 253      representations, and the use of trigonometric functions. These examples
 254      illustrate assembly-language programming on the 80387 NPX.
 255
 256   •  Appendix A, "Machine Instruction Encoding and Decoding," gives
 257      reference information on the encoding of NPX instructions. This
 258      information is useful to writers of debuggers, exception handlers, and
 259      compilers.
 260
 261   •  Appendix B, "Exception Summary," provides a list of the exceptions
 262      that each instruction can cause. This list is valuable to both
 263      applications and systems programmers.
 264
 265   •  Appendix C, "Compatibility between the 80387 and the 80287/8087,"
 266      describes the differences from the 80387 that are common to the 80287
 267      and the 8087.
 268
 269   •  Appendix D, "Compatibility between the 80387 and the 8087," describes
 270      the additional differences between the 80387 and the 8087 that are of
 271      concern when porting 8086/8087 programs directly to the 80386/80387.
 272
 273   •  Appendix E, "80387 80-Bit CHMOS III Numeric Processor Extension,"
 275      reproduces a data sheet of 80387 specifications that is separately
 276      available. The table of instruction timings in this appendix will be of
 277      interest to many readers of this manual. (The AC specifications have
 278      been deliberately left out.) The specifications in data sheets are
 279      subject to change; consult the most recent data sheet for design-in
 280      information.
 281
 282   •  Appendix F, "PC/AT-Compatible 80387 Connection," documents a
 283      nonstandard method of connecting an 80387 to an 80386 to achieve
 284      compatibility with the IBM PC/AT.
 285
 286   •  The Glossary defines 80387 and floating-point terminology. Refer to it
 287      as needed.
 288
 289 Related Publications
 290
 291 To best use the material in this manual, readers should be familiar with
 292 the operation and architecture of 80386 systems. The following manuals
 293 contain information related to the content of this manual and of interest to
 294 programmers of 80387 systems:
 295
 296   •  Introduction to the 80386, order number 231252
 297   •  80386 Data Sheet, order number 231630
 298   •  80386 Hardware Reference Manual, order number 231732
 299   •  80386 Programmer's Reference Manual, order number 230985
 300   •  80387 Data Sheet, order number 231920
 301
 302
 303 Notational Conventions
 304
 305 This manual uses special notation to represent subscript and superscript
 306 characters. Subscript characters are surrounded by {curly brackets}; for
 307 example, 10{2} = 10 base 2. Superscript characters are preceded by a caret
 308 and enclosed within (parentheses); for example, 10^(3) = 10 to the third
 309 power.
 310
 311
 312 Table of Contents
 313
 314 ---------------------------------------------------------------------------
 315
 316 Chapter 1  Introduction to the 80387 Numerics Processor Extension
 317
 318 1.1  History
 319 1.2  Performance
 320 1.3  Ease of Use
 321 1.4  Applications
 322 1.5  Upgradability
 323 1.6  Programming Interface
 324
 325 Chapter 2  80387 Numerics Processor Architecture
 326
 327 2.1  80387 Registers
 328       2.1.1  The NPX Register Stack
 329       2.1.2  The NPX Status Word
 330       2.1.3  Control Word
 331       2.1.4  The NPX Tag Word
 332       2.1.5  The NPX Instruction and Data Pointers
 333
 334 2.2  Computation Fundamentals
 335       2.2.1  Number System
 336       2.2.2  Data Types and Formats
 337               2.2.2.1  Binary Integers
 338               2.2.2.2  Decimal Integers
 339               2.2.2.3  Real Numbers
 340
 341       2.2.3  Rounding Control
 342       2.2.4  Precision Control
 343
 344 Chapter 3  Special Computational Situations
 345
 346 3.1  Special Numeric Values
 347       3.1.1  Denormal Real Numbers
 348               3.1.1.1  Denormals and Gradual Underflow
 349
 350       3.1.2  Zeros
 351       3.1.3  Infinity
 352       3.1.4  NaN (Not-a-Number)
 353               3.1.4.1  Signaling NaNs
 354               3.1.4.2  Quiet NaNs
 355
 356       3.1.5  Indefinite
 357       3.1.6  Encoding of Data Types
 358       3.1.7  Unsupported Formats
 359
 360 3.2  Numeric Exceptions
 361       3.2.1  Handling Numeric Exceptions
 362               3.2.1.1  Automatic Exception Handling
 363               3.2.1.2  Software Exception Handling
 364
 365       3.2.2  Invalid Operation
 366               3.2.2.1  Stack Exception
 367               3.2.2.2  Invalid Arithmetic Operation
 368
 369       3.2.3  Division by Zero
 370       3.2.4  Denormal Operand
 371       3.2.5  Numeric Overflow and Underflow
 372               3.2.5.1  Overflow
 373               3.2.5.2  Underflow
 374
 375       3.2.6  Inexact (Precision)
 376       3.2.7  Exception Priority
 377       3.2.8  Standard Underflow/Overflow Exception Handler
 378
 379 Chapter 4  The 80387 Instruction Set
 380
 381 4.1  Compatibility with the 80287 and 8087
 382 4.2  Numeric Operands
 383 4.3  Data Transfer Instructions
 384       4.3.1  FLD source
 385       4.3.2  FST destination
 386       4.3.3  FSTP destination
 387       4.3.4  FXCH//destination
 388       4.3.5  FILD source
 389       4.3.6  FIST destination
 390       4.3.7  FISTP destination
 391       4.3.8  FBLD source
 392       4.3.9  FBSTP destination
 393
 394 4.4  Nontranscendental Instructions
 395       4.4.1  Addition
 396       4.4.2  Normal Subtraction
 397       4.4.3  Reversed Subtraction
 398       4.4.4  Multiplication
 399       4.4.5  Normal Division
 400       4.4.6  Reversed Division
 401       4.4.7  FSQRT
 402       4.4.8  FSCALE
 403       4.4.9  FPREM---Partial Remainder (80287/8087-Compatible)
 404       4.4.10 FPREM1---Partial Remainder (IEEE Std. 754-Compatible)
 405       4.4.11 FRNDINT
 406       4.4.12 FXTRACT
 407       4.4.13 FABS
 408       4.4.14 FCHS
 409
 410 4.5  Comparison Instructions
 411       4.5.1  FCOM//source
 412       4.5.2  FCOMP//source
 413       4.5.3  FCOMPP
 414       4.5.4  FICOM source
 415       4.5.5  FICOMP source
 416       4.5.6  FTST
 417       4.5.7  FUCOM//source
 418       4.5.8  FUCOMP//source
 419       4.5.9  FUCOMPP
 420       4.5.10 FXAM
 421
 422 4.6  Transcendental Instructions
 423       4.6.1  FCOS
 424       4.6.2  FSIN
 425       4.6.3  FSINCOS
 426       4.6.4  FPTAN
 427       4.6.5  FPATAN
 428       4.6.6  F2XM1
 429       4.6.7  FYL2X
 430       4.6.8  FYL2XP1
 431
 432 4.7  Constant Instructions
 433       4.7.1  FLDZ
 434       4.7.2  FLD1
 435       4.7.3  FLDPI
 436       4.7.4  FLDL2T
 437       4.7.5  FLDL2E
 438       4.7.6  FLDLG2
 439       4.7.7  FLDLN2
 440
 441 4.8  Processor Control Instructions
 442       4.8.1  FINIT/FNINIT
 443       4.8.2  FLDCW source
 444       4.8.3  FSTCW/FNSTCW destination
 445       4.8.4  FSTSW/FNSTSW destination
 446       4.8.5  FSTSW AX/FNSTSW AX
 447       4.8.6  FCLEX/FNCLEX
 448       4.8.7  FSAVE/FNSAVE destination
 449       4.8.8  FRSTOR source
 450       4.8.9  FSTENV/FNSTENV destination
 451       4.8.10 FLDENV source
 452       4.8.11 FINCSTP
 453       4.8.12 FDECSTP
 454       4.8.13 FFREE destination
 455       4.8.14 FNOP
 456       4.8.15 FWAIT (CPU Instruction)
 457
 458 Chapter 5  Programming Numeric Applications
 459
 460 5.1  Programming Facilities
 461       5.1.1  High-Level Languages
 462       5.1.2  C Programs
 463       5.1.3  PL/M-386
 464       5.1.4  ASM386
 465               5.1.4.1  Defining Data
 466               5.1.4.2  Records and Structures
 467               5.1.4.3  Addressing Methods
 468
 469       5.1.5  Comparative Programming Example
 470       5.1.6  80387 Emulation
 471
 472 5.2  Concurrent Processing with the 80387
 473       5.2.1  Managing Concurrency
 474               5.2.1.1  Incorrect Exception Synchronization
 475               5.2.1.2  Proper Exception Synchronization
 476
 477 Chapter 6  System-Level Numeric Programming
 478
 479 6.1  80386/80387 Architecture
 480       6.1.1  Instruction and Operand Transfer
 481       6.1.2  Independent of CPU Addressing Modes
 482       6.1.3  Dedicated I/O Locations
 483
 484 6.2  Processor Initialization and Control
 485       6.2.1  System Initialization
 486       6.2.2  Hardware Recognition of the NPX
 487       6.2.3  Software Recognition of the NPX
 488       6.2.4  Configuring the Numerics Environment
 489       6.2.5  Initializing the 80387
 490       6.2.6  80387 Emulation
 491       6.2.7  Handling Numerics Exceptions
 492       6.2.8  Simultaneous Exception Response
 493       6.2.9  Exception Recovery Examples
 494
 495 Chapter 7  Numeric Programming Examples
 496
 497 7.1  Conditional Branching Example
 498 7.2  Exception Handling Examples
 499 7.3  Floating-Point to ASCII Conversion Examples
 500       7.3.1  Function Partitioning
 501       7.3.2  Exception Considerations
 502       7.3.3  Special Instructions
 503       7.3.4  Description of Operation
 504       7.3.5  Scaling the Value
 505               7.3.5.1  Inaccuracy in Scaling
 506               7.3.5.2  Avoiding Underflow and Overflow
 507               7.3.5.3  Final Adjustments
 508
 509       7.3.6  Output Format
 510
 511 7.4  Trigonometric Calculation Examples (Not Tested)
 512
 513 Appendix A  Machine Instruction Encoding and Decoding
 514
 515 Appendix B  Exception Summary
 516
 517 Appendix C  Compatibility Between the 80387 and the 80287/8087
 518
 519 Appendix D  Compatibility Between the 80387 and the 8087
 520
 521 Appendix E  80387 80-Bit CHMOS III Numeric Processor Extension
 522
 523 Appendix F  PC/AT-Compatible 80387 Connection
 524
 525 Glossary of 80387 and Floating-Point Terminology
 526
 527
 528 Figures
 529
 530 1-1     Evolution and Performance of Numeric Processors
 531
 532 2-1     80387 Register Set
 533 2-2     80387 Status Word
 534 2-3     80387 Control Word Format
 535 2-4     80387 Tag Word Format
 536 2-5     Protected Mode 80387 Instruction and Data Pointer Image in Memory,
 537             32-Bit Format
 538 2-6     Real Mode 80387 Instruction and Data Pointer Image in Memory,
 539             32-Bit Format
 540 2-7     Protected Mode 80387 Instruction and Data Pointer Image in Memory,
 541             16-Bit Format
 542 2-8     Real Mode 80387 Instruction and Data Pointer Image in Memory,
 543             16-Bit Format
 544 2-9     80387 Double-Precision Number System
 545 2-10    80387 Data Formats
 546
 547 3-1     Floating-Point System with Denormals
 548 3-2     Floating-Point System without Denormals
 549 3-3     Arithmetic Example Using Infinity
 550
 551 4-1     FSAVE/FRSTOR Memory Layout (32-Bit)
 552 4-2     FSAVE/FRSTOR Memory Layout (16-Bit)
 553 4-3     Protected Mode 80387 Environment, 32-Bit Format
 554 4-4     Real Mode 80387 Environment, 32-Bit Format
 555 4-5     Protected Mode 80387 Environment, 16-Bit Format
 556 4-6     Real Mode 80387 Environment, 16-Bit Format
 557
 558 5-1     Sample C-386 Program
 559 5-2     Sample 80387 Constants
 560 5-3     Status Word Record Definition
 561 5-4     Structure Definition
 562 5-5     Sample PL/M-386 Program
 563 5-6     Sample ASM386 Program
 564 5-7     Instructions and Register Stack
 565 5-8     Exception Synchronization Examples
 566
 567 6-1     Software Routine to Recognize the 80287
 568
 569 7-1     Conditional Branching for Compares
 570 7-2     Conditional Branching for FXAM
 571 7-3     Full-State Exception Handler
 572 7-4     Reduced-Latency Exception Handler
 573 7-5     Reentrant Exception Handler
 574 7-6     Floating-Point to ASCII Conversion Routine
 575 7-7     Relationships between Adjacent Joints (see page 7-22 in the
 576             printed version of this manual)
 577 7-8     Robot Arm Kinematics Example
 578
 579
 580 Tables
 581
 582 1-1     Numeric Processing Speed Comparisons
 583 1-2     Numeric Data Types
 584 1-3     Principal NPX Instructions
 585
 586 2-1     Condition Code Interpretation
 587 2-2     Correspondence between 80387 and 80386 Flag Bits
 588 2-3     Summary of Format Parameters
 589 2-4     Real Number Notation
 590 2-5     Rounding Modes
 591
 592 3-1     Arithmetic and Nonarithmetic Instructions
 593 3-2     Denormalization Process
 594 3-3     Zero Operands and Results
 595 3-4     Infinity Operands and Results
 596 3-5     Rules for Generating QNaNs
 597 3-6     Binary Integer Encodings
 598 3-7     Packed Decimal Encodings
 599 3-8     Single and Double Real Encodings
 600 3-9     Extended Real Encodings
 601 3-10    Masked Responses to Invalid Operations
 602 3-11    Masked Overflow Results
 603
 604 4-1     Data Transfer Instructions
 605 4-2     Nontranscendental Instructions
 606 4-3     Basic Nontranscendental Instructions and Operands
 607 4-4     Condition Code Interpretation after FPREM and FPREM1
 608             Instructions
 609 4-5     Comparison Instructions
 610 4-6     Condition Code Resulting from Comparisons
 611 4-7     Condition Code Resulting from FTST
 612 4-8     Condition Code Defining Operand Class
 613 4-9     Transcendental Instructions
 614 4-10    Results of FPATAN
 615 4-11    Constant Instructions
 616 4-12    Processor Control Instructions
 617
 618 5-1     PL/M-386 Built-In Procedures
 619 5-2     ASM386 Storage Allocation Directives
 620 5-3     Addressing Method Examples
 621
 622 6-1     NPX Processor State Following Initialization
 623
 624
 625 Chapter 1  Introduction to the 80387 Numerics Processor Extension
 626
 627 ---------------------------------------------------------------------------
 628
 629 The 80387 NPX is a high-performance numerics processing element that
 630 extends the 80386 architecture by adding significant numeric capabilities
 631 and direct support for floating-point, extended-integer, and BCD data types.
 632 The 80386 CPU with 80387 NPX easily supports powerful and accurate numeric
 633 applications through its implementation of the IEEE Standard 754 for Binary
 634 Floating-Point Arithmetic. The 80387 provides floating-point performance
 635 comparable to that of large minicomputers while offering compatibility with
 636 object code for 8087 and 80287.
 637
 638
 639 1.1  History
 640
 641 The 80387 Numeric Processor Extension (NPX) is compatible with its
 642 predecessors, the earlier Intel 8087 NPX and 80287 NPX. As the 80386 runs
 643 8086 programs, so programs designed to use the 8087 and 80287 should run
 644 unchanged on the 80387.
 645
 646 The 8087 NPX was designed for use in 8086-family systems. The 8086 was the
 647 first microprocessor family to partition the processing unit to permit
 648 high-performance numeric capabilities. The 8087 NPX for this processor
 649 family implemented a complete numeric processing environment in compliance
 650 with an early proposal for the IEEE 754 Floating-Point Standard.
 651
 652 With the 80287 Numeric Processor Extension, high-speed numeric computations
 653 were extended to 80286 high-performance multitasking and multiuser systems.
 654 Multiple tasks using the numeric processor extension were afforded the full
 655 protection of the 80286 memory management and protection features.
 656
 657 The 80387 Numeric Processor Extension is Intel's third generation numerics
 658 processor. The 80387 implements the final IEEE standard, adds new
 659 trigonometric instructions, and uses a new design and CHMOS-III process to
 660 allow higher clock rates and require fewer clocks per instruction. Together,
 661 the 80387 with additional instructions and the improved standard bring even
 662 more convenience and reliability to numerics programming and make this
 663 convenience and reliability available to applications that need the
 664 high speed and large memory capacity of the 32-bit environment of the 80386
 665 CPU.
 666
 667 Figure 1-1 illustrates the relative performance of 5-MHz 8086/8087,
 668 8-MHz 80286/80287, and 20-MHz 80386/80387 systems in executing
 669 numerics-oriented applications.
 670
 671
 672 Figure 1-1.  Evolution and Performance of Numeric Processors
 673
 674                   16|                       80386/80387 (20 MHz)
 675                   15|
 676                   14|
 677                   13|
 678                   12|
 679                   11|
 680      RELATIVE     10|
 681      PERFORMANCE   9|
 682                    8|
 683                    7|
 684                    6|
 685                    5|
 686                    4|
 687                    3|     80286/80287 (8 MHz)
 688                    2|
 689                    1|  8086/8087 (5 MHz)
 690                     +-------------------------------------
 691                          1980        1983        1987
 692
 693                                  YEAR INTRODUCED
 694
 695
 696 1.2  Performance
 697
 698 Table 1-1 compares the execution times of several 80387 instructions with
 699 the equivalent operations executed on an 8-MHz 80287. As indicated in the
 700 table, the 16-MHz 80387 NPX provides about 5 to 6 times the performance of
 701 an 8-MHz 80287 NPX. A 16-MHz 80387 multiplies 32-bit and 64-bit
 702 floating-point numbers in about 1.9 and 2.8 microseconds, respectively. Of
 703 course, the actual performance of the NPX in a given system depends on the
 704 characteristics of the individual application.
 705
 706 Although the performance figures shown in Table 1-1 refer to operations on
 707 real (floating-point) numbers, the 80387 also manipulates fixed-point
 708 binary and decimal integers of up to 64 bits or 18 digits, respectively. The
 709 80387 can improve the speed of multiple-precision software algorithms for
 710 integer operations by 10 to 100 times.
 711
 712 Because the 80387 NPX is an extension of the 80386 CPU, no software
 713 overhead is incurred in setting up the NPX for computation. The 80387 and
 714 80386 processors coordinate their activities in a manner transparent to
 715 software. Moreover, built-in coordination facilities allow the 80386 CPU to
 716 proceed with other instructions while the 80387 NPX is simultaneously
 717 executing numeric instructions. Programs can exploit this concurrency of
 718 execution to further increase system performance and throughput.
 719
 720
 721 Table 1-1.  Numeric Processing Speed Comparisons
 722
 723                                            Approximate Performance Ratio:
 724        Floating-Point Instruction                16 MHz 80386/80387 ÷
 725                                                  8 MHz 80286/80287
 726
 727 FADD      ST, ST(i)                Addition             6.2
 728 FDIV      dword_var                Division             4.7
 729 FYL2X     stack (0), (1) assumed   Logarithm            6.0
 730 FPATAN    stack (0) assumed        Arctangent           2.6 *
 731 F2XM1     stack (0) assumed        Exponentiation       2.7 *
 732
 733 * The ratio is higher if the operand is not in range of the 80287
 734   instruction.
 735
 736
 737
 738 1.3  Ease of Use
 739
 740 The 80387 NPX offers more than raw execution speed for
 741 computation-intensive tasks. The 80387 brings the functionality and power of
 742 accurate numeric computation into the hands of the general user. These
 743 features are available in most high-level languages available for the 80386.
 744
 745 Like the 8087 and 80287 that preceded it, the 80387 is explicitly designed
 746 to deliver stable, accurate results when programmed using straightforward
 747 "pencil and paper" algorithms. The IEEE standard 754 specifically addresses
 748 this issue, recognizing the fundamental importance of making numeric
 749 computations both easy and safe to use.
 750
 751 For example, most computers can overflow when two single-precision
 752 floating-point numbers are multiplied together and then divided by a third,
 753 even if the final result is a perfectly valid 32-bit number. The 80387
 754 delivers the correctly rounded result. Other typical examples of undesirable
 755 machine behavior in straightforward calculations occur when computing
 756 financial rate of return, which involves the expression (1 + i)^(n) or when
 757 solving for roots of a quadratic equation:
 758
 759        -b ± √(b² - 4ac)
 760        ----------------
 761               2a
 762
 763 If a does not equal 0, the formula is numerically unstable when the roots
 764 are nearly coincident or when their magnitudes are wildly different. The
 765 formula is also vulnerable to spurious over/underflows when the coefficients
 766 a, b, and c are all very big or all very tiny. When single-precision
 767 (4-byte) floating-point coefficients are given as data and the formula is
 768 evaluated in the 80387's normal way, keeping all intermediate results in
 769 its stack, the 80387 produces impeccable single-precision roots. This
 770 happens because, by default and with no effort on the programmer's part, the
 771 80387 evaluates all those subexpressions with so much extra precision and
 772 range as to overwhelm any threat to numerical integrity.
 773
 774 If double-precision data and results were at issue, a better formula would
 775 have to be used, and once again the 80387's default evaluation of that
 776 formula would provide substantially enhanced numerical integrity over mere
 777 double-precision evaluation.
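
The multiply-then-divide overflow described above can be imitated in a few
lines of Python. This is only a sketch under stated assumptions: Python
floats are IEEE double-precision values, used here as a stand-in for a wider
intermediate format (the 80387 itself holds intermediates in its 80-bit
extended format), and the helper names to_single, mul_div_single, and
mul_div_wide are hypothetical, not part of any Intel tool.

```python
import struct

def to_single(x):
    """Round a Python float (an IEEE double) to the nearest single real.

    struct raises OverflowError if a finite x is too large for 32 bits.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mul_div_single(a, b, c):
    """(a * b) / c with the intermediate product held in single precision."""
    product = to_single(a * b)      # overflows when a * b exceeds ~3.4e38
    return to_single(product / c)

def mul_div_wide(a, b, c):
    """(a * b) / c with wide intermediates; only the result is rounded."""
    return to_single((a * b) / c)

a = b = c = to_single(1.0e30)
try:
    narrow = mul_div_single(a, b, c)
except OverflowError:
    narrow = None                   # the product (about 1e60) does not fit
wide = mul_div_wide(a, b, c)        # the final quotient fits easily
```

With these inputs the narrow evaluation fails on the intermediate product,
while the wide evaluation returns a valid single-precision result equal to
the original operand.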
 778
 779 On most machines, straightforward algorithms will not deliver consistently
 780 correct results (and will not indicate when they are incorrect). To obtain
 781 correct results on traditional machines under all conditions usually
 782 requires sophisticated numerical techniques that are foreign to most
 783 programmers. General application programmers using straightforward
 784 algorithms will produce much more reliable programs using the 80387. This
 785 simple fact greatly reduces the software investment required to develop
 786 safe, accurate computation-based products.
 787
 788 Beyond traditional numerics support for scientific applications, the 80387
 789 has built-in facilities for commercial computing. It can process decimal
 790 numbers of up to 18 digits without round-off errors, performing exact
 791 arithmetic on integers as large as 2^(64) or 10^(18). Exact arithmetic is
 792 vital in accounting applications where rounding errors may introduce
 793 monetary losses that cannot be reconciled.
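
As an illustration of exact decimal handling, the following Python sketch
encodes an integer of up to 18 digits into a 10-byte packed decimal image
and decodes it again. The byte layout used here (two digits per byte, least
significant pair first, sign in bit 7 of the last byte) is an assumption of
the sketch based on the usual 80x87 packed decimal format; it is not spelled
out in this section. The names pack_bcd and unpack_bcd are hypothetical.

```python
def pack_bcd(value):
    """Encode an integer with at most 18 decimal digits as 10 bytes."""
    if abs(value) >= 10**18:
        raise ValueError("packed decimal holds at most 18 digits")
    digits = f"{abs(value):018d}"           # most significant digit first
    image = bytearray()
    for i in range(16, -2, -2):             # least significant pair first
        high, low = int(digits[i]), int(digits[i + 1])
        image.append((high << 4) | low)
    image.append(0x80 if value < 0 else 0x00)   # sign byte
    return bytes(image)

def unpack_bcd(image):
    """Decode a 10-byte packed decimal image back into an integer."""
    if len(image) != 10:
        raise ValueError("packed decimal image must be 10 bytes")
    value = 0
    for byte in reversed(image[:9]):        # most significant pair first
        value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return -value if image[9] & 0x80 else value

# Every 18-digit integer fits in a 64-bit significand without rounding.
largest = 10**18 - 1
fits_exactly = largest < 2**63
```

Because no digit is ever rounded, a round trip through pack_bcd and
unpack_bcd returns the original integer for any value in range.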
 794
 795 The NPX contains a number of optional facilities that can be invoked by
 796 sophisticated users. These advanced features include directed rounding,
 797 gradual underflow, and programmed exception-handling facilities.
 798
 799 These automatic exception-handling facilities permit a high degree of
 800 flexibility in numeric processing software, without burdening the
 801 programmer. While performing numeric calculations, the NPX automatically
 802 detects exception conditions that can potentially damage a calculation (for
 803 example, X ÷ 0 or √X when X < 0). By default, on-chip exception logic
 804 handles these exceptions so that a reasonable result is produced and
 805 execution may proceed without program interruption. Alternatively, the NPX
 806 can signal the CPU, invoking a software exception handler to provide special
 807 results whenever various types of exceptions are detected.
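
The default (masked) handling of the two examples above can be sketched in
Python. The sketch models only the IEEE 754 default results (a signed
infinity for division by zero, a NaN for an invalid operation); it does not
model the status flags or the particular NaN encoding that the 80387
returns. masked_divide and masked_sqrt are hypothetical names.

```python
import math

def masked_divide(x, y):
    """x / y with the default responses instead of a program interruption."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if y == 0.0:
        if x == 0.0:
            return math.nan                 # 0 / 0 is an invalid operation
        # The sign of the infinity is the product of the operand signs.
        sign = math.copysign(1.0, x) * math.copysign(1.0, y)
        return math.copysign(math.inf, sign)
    return x / y

def masked_sqrt(x):
    """Square root with the default response for negative operands."""
    if x < 0.0:
        return math.nan                     # invalid operation
    return math.sqrt(x)
```

Execution continues with these results; a program that needs different
behavior would instead unmask the exception and supply its own handler.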
 808
 809
 810 1.4  Applications
 811
 812 The 80386's versatility and performance make it appropriate to a broad
 813 array of numeric applications. In general, applications that exhibit any of
 814 the following characteristics can benefit by implementing numeric processing
 815 on the 80387:
 816
 817   •  Numeric data vary over a wide range of values, or include nonintegral
 818      values.
 819
 820   •  Algorithms produce very large or very small intermediate results.
 821
 822   •  Computations must be very precise; i.e., a large number of significant
 823      digits must be maintained.
 824
 825   •  Performance requirements exceed the capacity of traditional
 826      microprocessors.
 827
 828   •  Consistently safe, reliable results must be delivered using a
 829      programming staff that is not expert in numerical techniques.
 830
 831 Note also that the 80387 can reduce software development costs and improve
 832 the performance of systems that not only use real numbers but also operate
 833 on multiprecision binary or decimal integer values.
 834
 835 A few examples, which show how the 80387 might be used in specific numerics
 836 applications, are described below. In many cases, these types of systems
 837 have been implemented in the past with minicomputers or small mainframe
 838 computers. The advent of the 80387 brings the size and cost savings of
 839 microprocessor technology to these applications for the first time.
 840
 841   •  Business data processing--The NPX's ability to accept decimal operands
 842      and produce exact decimal results of up to 18 digits greatly simplifies
 843      accounting programming. Financial calculations that use power functions
 844      can take advantage of the 80387's exponentiation and logarithmic
 845      instructions. Many business software packages can benefit from the
 846      speed and accuracy of the 80387; for example, Lotus* 1-2-3*,
 847      Multiplan*, SuperCalc*, and Framework*.
 848
 849   •  Simulation--The large (32-bit) memory space of the 80386 coupled with
 850      the raw speed of the 80386 and 80387 processors make 80386/80387
 851      microsystems suitable for attacking large simulation problems, which
 852      heretofore could only be executed on expensive mini and mainframe
 853      computers. For example, complex electronic circuit simulations using
 854      SPICE can now be performed on a microcomputer, the 80386/80387.
 855      Simulation of mechanical systems using finite element analysis can
 856      employ more elements, resulting in more detailed analysis or simulation
 857      of larger systems.
 858
 859   •  Graphics transformations--The 80387 can be used in graphics terminals
 860      to locally perform many functions that normally demand the attention of
 861      a main computer; these include rotation, scaling, and interpolation. By
 862      also using an 82786 Graphics Display Controller to perform high-speed
 863      drawing and window management, very powerful and highly self-sufficient
 864      terminals can be built from a relatively small number of 80386 family
 865      parts.
 866
 867   •  Process control--The 80387 solves dynamic range problems
 868      automatically, and its extended precision allows control functions to
 869      be fine-tuned for more accurate and efficient performance. Control
 870      algorithms implemented with the NPX also contribute to improved
 871      reliability and safety, while the 80387's speed can be exploited in
 872      real-time operations.
 873
 874   •  Computer numerical control (CNC)--The 80387 can move and position
 875      machine tool heads with accuracy in real-time. Axis positioning also
 876      benefits from the hardware trigonometric support provided by the 80387.
 877
 878   •  Robotics--Coupling small size and modest power requirements with
 879      powerful computational abilities, the 80387 is ideal for on-board
 880      six-axis positioning.
 881
 882   •  Navigation--Very small, lightweight, and accurate inertial guidance
 883      systems can be implemented with the 80387. Its built-in trigonometric
 884      functions can speed and simplify the calculation of position from
 885      bearing data.
 886
 887   •  Data acquisition--The 80387 can be used to scan, scale, and reduce
 888      large quantities of data as it is collected, thereby lowering storage
 889      requirements and time required to process the data for analysis.
 890
 891 The preceding examples are oriented toward traditional numerics
 892 applications. There are, in addition, many other types of systems that do
 893 not appear to the end user as computational, but can employ the 80387 to
 894 advantage. Indeed, the 80387 presents the imaginative system designer with
 895 an opportunity similar to that created by the introduction of the
 896 microprocessor itself. Many applications can be viewed as numerically-based
 897 if sufficient computational power is available to support this view (e.g.,
 898 character generation for a laser printer). This is analogous to the
 899 thousands of successful products that have been built around "buried"
 900 microprocessors, even though the products themselves bear little
 901 resemblance to computers.
 902
 903
 904 1.5  Upgradability
 905
 906 The architecture of the 80386 CPU is specifically adapted to allow easy
 907 upgradability to use an 80387, simply by plugging in the 80387 NPX. For this
 908 reason, designers of 80386 systems may wish to incorporate the 80387 NPX
 909 into their designs in order to offer two levels of price and performance at
 910 little additional cost.
 911
 912 Two features of the 80386 CPU make the design and support of upgradable
 913 80386 systems particularly simple:
 914
 915   •  The 80386 can be programmed to recognize the presence of an 80387 NPX;
 916      that is, software can recognize whether it is running on an 80386 with
 917      or without an 80387 NPX.
 918
 919   •  After determining whether the 80387 NPX is available, the 80386 CPU
 920      can be instructed to let the NPX execute all numeric instructions. If
 921      an 80387 NPX is not available, the 80386 CPU can emulate all 80387
 922      numeric instructions in software. This emulation is completely
 923      transparent to the application software--the same object code may be
 924      used by 80386 systems both with and without an 80387 NPX. No relinking
 925      or recompiling of application software is necessary; the same code will
 926      simply execute faster with the 80387 NPX than without.
 927
 928 To facilitate this design of upgradable 80386 systems, Intel provides a
 929 software emulator for the 80387 that provides the functional equivalent of
 930 the 80387 hardware, implemented in software on the 80386. Except for timing,
 931 the operation of this 80387 emulator (EMUL387) is the same as for the 80387
 932 NPX hardware. When the emulator is combined as part of the systems software,
 933 the 80386 system with 80387 emulation and the 80386 with 80387 hardware are
 934 virtually indistinguishable to an application program. This capability
 935 makes it easy for software developers to maintain a single set of programs
 936 for both systems. System manufacturers can offer the NPX as a simple plug-in
 937 performance option without necessitating any changes in the user's software.
 938
 939
 940 1.6  Programming Interface
 941
 942 The 80386/80387 pair is programmed as a single processor; all of the 80387
 943 registers appear to a programmer as extensions of the basic 80386 register
 944 set. The 80386 has a class of instructions known as ESCAPE instructions, all
 945 having a common format. These ESC instructions are numeric instructions for
 946 the 80387 NPX. These numeric instructions for the 80387 are simply encoded
 947 into the instruction stream along with 80386 instructions.
 948
 949 All of the CPU memory-addressing modes may be used in programming the NPX,
 950 allowing convenient access to record structures, numeric arrays, and other
 951 memory-based data structures. All of the memory management and protection
 952 features of the CPU (both paging and segmentation) are extended to the NPX
 953 as well.
 954
 955 Numeric processing in the 80387 centers around the NPX register stack.
 956 Programmers can treat these eight 80-bit registers either as a fixed
 957 register set, with instructions operating on explicitly-designated
 958 registers, or as a classical stack, with instructions operating on the top
 959 one or two stack elements.
 960
 961 Internally, the 80387 holds all numbers in a uniform 80-bit extended
 962 format. Operands that may be represented in memory as 16-, 32-, or 64-bit
 963 integers, 32-, 64-, or 80-bit floating-point numbers, or 18-digit packed BCD
 964 numbers, are automatically converted into extended format as they are loaded
 965 into the NPX registers. Computation results are subsequently converted back
 966 into one of these destination data formats when they are stored into memory
 967 from the NPX registers.
 968
 969 Table 1-2 lists each of the seven data types supported by the 80387,
 970 showing the data format for each type. All operands are stored in memory
 971 with the least significant digits starting at the initial (lowest) memory
 972 address. Numeric instructions access and store memory operands using only
 973 this initial address. For maximum system performance, all operands should
 974 start at memory addresses divisible by four.
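
The storage order described above (least significant byte at the lowest
address) can be observed with Python's struct module. This is a sketch for
illustration only; little_endian_bytes and is_dword_aligned are hypothetical
helper names, and the addresses passed to the alignment check are made-up
values.

```python
import struct

def little_endian_bytes(fmt, value):
    """Return the memory image of value, lowest address first."""
    return struct.pack("<" + fmt, value)

word = little_endian_bytes("h", 0x1234)     # 16-bit word integer
single = little_endian_bytes("f", 1.0)      # 32-bit single real (3F800000H)
double = little_endian_bytes("d", 1.0)      # 64-bit double real

def is_dword_aligned(address):
    """True if an operand address is divisible by four."""
    return address % 4 == 0
```

For the word integer 1234H, the byte 34H lands at the initial address and
12H at the next one; the real formats follow the same rule, so the sign and
exponent bits of 1.0 appear in the last bytes of each image.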
 975
 976 Table 1-3 lists the 80387 instructions by class. No special programming
 977 tools are necessary to use the 80387, because all of the NPX instructions
 978 and data types are directly supported by the ASM386 Assembler, by high-level
 979 languages from Intel, and by assemblers and compilers produced by many
 980 independent software vendors. Software routines for the 80387 may be written
 981 in ASM386 Assembler or any of the following higher-level languages from
 982 Intel:
 983
 984       PL/M-386
 985       C-386
 986
 987 In addition, all of the development tools supporting the 8086/8087 and
 988 80286/80287 can also be used to develop software for the 80386/80387.
 989
 990 All of these high-level languages provide programmers with access to the
 991 computational power and speed of the 80387 without requiring an
 992 understanding of the architecture of the 80386 and 80387 chips. Such
 993 architectural considerations as concurrency and synchronization are handled
 994 automatically by these high-level languages. For the ASM386 programmer,
 995 specific rules for handling these issues are discussed in a later section
 996 of this manual.
 997
 998 The following operating systems are known or expected to support the
 999 80387: RMX-286/386, MS-DOS, Xenix-286/386, and Unix-286/386. Advanced
1000 in-circuit debugging support is provided by ICE-386.
1001
1002
1003 Table 1-2.  Numeric Data Types
1004
1005 Data Type       Bits  Significant  Approximate Range (Decimal)
1006                       Digits
1007                       (Decimal)
1008
1009 Word integer    16    4            -32,768 ≤ X ≤ +32,767
1010 Short integer   32    9            -2*10^(9) ≤ X ≤ +2*10^(9)
1011 Long integer    64    18           -9*10^(18) ≤ X ≤ +9*10^(18)
1012 Packed decimal  80    18           -99...99 ≤ X ≤ +99...99 (18 digits)
1013 Single real     32    6-7          1.18*10^(-38) ≤ |X| ≤ 3.40*10^(38)
1014 Double real     64    15-16        2.23*10^(-308) ≤ |X| ≤ 1.80*10^(308)
1015 Extended real   80    19           3.30*10^(-4932) ≤ |X| ≤ 1.21*10^(4932)
1016
1017 NOTE: Extended real is equivalent to double extended format of IEEE Std 754.
1018
1019 Table 1-3.  Principal NPX Instructions
1020
1021 Class                Instruction Types
1022
1023 Data Transfer        Load (all data types), Store (all data types), Exchange
1024
1025 Arithmetic           Add, Subtract, Multiply, Divide, Subtract Reversed,
1026                      Divide Reversed, Square Root, Scale, Remainder, Integer
1027                      Part, Change Sign, Absolute Value, Extract
1028
1029 Comparison           Compare, Examine, Test
1030
1031 Transcendental       Tangent, Arctangent, Sine, Cosine, Sine and Cosine,
1032                      2^(x) - 1, Y * Log{2}(X), Y * Log{2}(X+1)
1033
1034 Constants            0, 1, π, Log{10}2, Log{e}2, Log{2}10, Log{2}e
1035
1036 Processor Control    Load Control Word, Store Control Word, Store Status
1037                      Word, Load Environment, Store Environment, Save,
1038                      Restore, Clear Exceptions, Initialize
1039
1060
1061
1062 Chapter 2  80387 Numerics Processor Architecture
1063
1064 ----------------------------------------------------------------------------
1065
1066 To the programmer, the 80387 NPX appears as a set of additional registers,
1067 data types, and instructions--all of which complement those of the 80386.
1068 Refer to Chapter 4 for detailed explanations of the 80387 instruction set.
1069 This chapter explains the new registers and data types that the 80387 brings
1070 to the architecture of the 80386.
1071
1072
1073 2.1  80387 Registers
1074
1075 The additional registers consist of
1076
1077   •  Eight individually-addressable 80-bit numeric registers, organized as
1078      a register stack
1079
1080   •  Three sixteen-bit registers containing:
1081
1082      the NPX status word
1083      the NPX control word
1084      the tag word
1085
1086   •  Two 48-bit registers containing pointers to the current instruction
1087      and operand (these registers are actually located in the 80386)
1088
1089 All of the NPX numeric instructions focus on the contents of these NPX
1090 registers.
1091
1092
1093 2.1.1  The NPX Register Stack
1094
1095 The 80387 register stack is shown in Figure 2-1. Each of the eight numeric
1096 registers in the 80387's register stack is 80 bits wide and is divided into
1097 fields corresponding to the NPX's extended real data type.
1098
1099 Numeric instructions address the data registers relative to the register on
1100 the top of the stack. At any point in time, this top-of-stack register is
1101 indicated by the TOP (stack TOP) field in the NPX status word. Load or push
1102 operations decrement TOP by one and load a value into the new top register.
1103 A store-and-pop operation stores the value from the current TOP register and
1104 then increments TOP by one. Like 80386 stacks in memory, the 80387 register
1105 stack grows down toward lower-addressed registers.
1106
1107 Many numeric instructions have several addressing modes that permit the
1108 programmer to implicitly operate on the top of the stack, or to explicitly
1109 operate on specific registers relative to the TOP. The ASM386 Assembler
1110 supports these register addressing modes, using the expression ST(0), or
1111 simply ST, to represent the current Stack Top and ST(i) to specify the ith
1112 register from TOP in the stack (0 ≤ i ≤ 7). For example, if TOP contains
1113 011B (register 3 is the top of the stack), the following statement would add
1114 the contents of two registers in the stack (registers 3 and 5):
1115
1116 FADD   ST, ST(2)
1117
1118 The stack organization and top-relative addressing of the numeric registers
1119 simplify subroutine programming by allowing routines to pass parameters on
1120 the register stack. By using the stack to pass parameters rather than using
1121 "dedicated" registers, calling routines gain more flexibility in how they
1122 use the stack. As long as the stack is not full, each routine simply loads
1123 the parameters onto the stack before calling a particular subroutine to
1124 perform a numeric calculation. The subroutine then addresses its parameters
1125 as ST, ST(1), etc., even though TOP may, for example, refer to physical
1126 register 3 in one invocation and physical register 5 in another.
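
The top-relative addressing described in this section can be modeled with a
short Python class. It is a toy sketch, not a description of the hardware:
it tracks values and TOP only, omits the tag word (so stack overflow and
underflow are not detected), and lets fadd name any two registers for
simplicity. RegisterStack and its method names are hypothetical.

```python
class RegisterStack:
    """Toy model of the eight-register NPX stack (values and TOP only)."""

    def __init__(self, top=0):
        self.regs = [0.0] * 8       # physical registers 0 through 7
        self.top = top & 7          # 3-bit TOP field

    def physical(self, i):
        """Physical register number addressed by ST(i)."""
        if not 0 <= i <= 7:
            raise ValueError("ST(i) requires 0 <= i <= 7")
        return (self.top + i) & 7

    def push(self, value):
        """Load: decrement TOP, then write the new top register."""
        self.top = (self.top - 1) & 7
        self.regs[self.top] = value

    def pop(self):
        """Store-and-pop: read the top register, then increment TOP."""
        value = self.regs[self.top]
        self.top = (self.top + 1) & 7
        return value

    def fadd(self, dst, src):
        """Model of FADD ST(dst), ST(src)."""
        self.regs[self.physical(dst)] += self.regs[self.physical(src)]

# The example from the text: TOP = 011B, so FADD ST, ST(2) adds
# physical register 5 into physical register 3.
npx = RegisterStack(top=3)
npx.regs[3], npx.regs[5] = 1.5, 2.0
npx.fadd(0, 2)
```

Because physical() wraps modulo 8, a subroutine can refer to ST and ST(1)
without knowing which physical registers its caller happened to use.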
1127
1128
1129 Figure 2-1.  80387 Register Set
1130
1131                             80387 DATA REGISTERS                 TAG
1132                                                                 FIELD
1133              79 78    64 63                                 0    1 0
1134           +----+--------+------------------------------------+  +---+
1135         R0|SIGN|EXPONENT|             SIGNIFICAND            |  |   |
1136         R1+----+--------+------------------------------------+  +---+
1137         R2+----+--------+------------------------------------+  +---+
1138         R3+----+--------+------------------------------------+  +---+
1139         R4+----+--------+------------------------------------+  +---+
1140         R5+----+--------+------------------------------------+  +---+
1141         R6+----+--------+------------------------------------+  +---+
1142         R7+----+--------+------------------------------------+  +---+
1143           +----+--------+------------------------------------+  +---+
1144
1145            15                0   47                                0
1146           +-------------------+ +-----------------------------------+
1147           | CONTROL REGISTER  | |        INSTRUCTION POINTER        |
1148           +-------------------+ +-----------------------------------+
1149           |  STATUS REGISTER  | |            DATA POINTER           |
1150           +-------------------+ +-----------------------------------+
1151           |     TAG WORD      |
1152           +-------------------+
1153
1154
1155 2.1.2  The NPX Status Word
1156
1157 The 16-bit status word shown in Figure 2-2 reflects the overall state of
1158 the 80387. This status word may be stored into memory using the
1159 FSTSW/FNSTSW, FSTENV/FNSTENV, and FSAVE/FNSAVE instructions, and can be
1160 transferred into the 80386 AX register with the FSTSW AX/FNSTSW AX
1161 instructions, allowing the NPX status to be inspected by the CPU.
1162
1163 The B-bit (bit 15) is included for 8087 compatibility only. It reflects the
1164 contents of the ES bit (bit 7 of the status word), not the status of the
1165 BUSY# output of the 80387.
1166
1167 The four NPX condition code bits (C{3}-C{0}) are similar to the flags in a
1168 CPU: the 80387 updates these bits to reflect the outcome of arithmetic
1169 operations. The effect of each instruction on the condition code bits is
1170 summarized in Table 2-1. These condition code bits are used principally for
1171 conditional branching. The FSTSW AX instruction stores the NPX status word
1172 directly into the CPU AX register, allowing these condition codes to be
1173 inspected efficiently by 80386 code. The 80386 SAHF instruction can copy
1174 C{3}-C{0} directly to 80386 flag bits to simplify conditional branching.
1175 Table 2-2 shows the mapping of these bits to the 80386 flag bits.
1176
1177 Bits 11-13 of the status word point to the 80387 register that is the
1178 current Top of Stack (TOP). The significance of the stack top has been
1179 described in the prior section on the register stack.
1180
1181 Figure 2-2 shows the six exception flags in bits 0-5 of the status word.
1182 Bit 7 is the exception summary status (ES) bit. ES is set if any unmasked
1183 exception bits are set, and is cleared otherwise. If this bit is set, the
1184 ERROR# signal is asserted. Bits 0-5 indicate whether the NPX has detected
1185 one of six possible exception conditions since these status bits were last
1186 cleared or reset. They are "sticky" bits, and can only be cleared by the
1187 instructions FINIT, FCLEX, FLDENV, FSAVE, and FRSTOR.
1188
1189 Bit 6 is the stack fault (SF) bit. This bit distinguishes invalid
1190 operations due to stack overflow or underflow from other kinds of invalid
1191 operations. When SF is set, bit 9 (C{1}) distinguishes between stack
1192 overflow (C{1} = 1) and underflow (C{1} = 0).
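
The fields of the status word can be picked apart with shifts and masks, as
in the following Python sketch. Bit positions are taken from Figure 2-2
(TOP occupies bits 11-13, between C{3} in bit 14 and C{2} in bit 10). The
names FLAG_BITS, decode_status_word, and stack_fault_kind are hypothetical,
and the sample value 1AC1H is made up for the example.

```python
# Single-bit fields of the status word, by bit number.
FLAG_BITS = {
    "IE": 0, "DE": 1, "ZE": 2, "OE": 3, "UE": 4, "PE": 5,
    "SF": 6, "ES": 7, "C0": 8, "C1": 9, "C2": 10, "C3": 14, "B": 15,
}

def decode_status_word(sw):
    """Split a 16-bit status word into its named fields."""
    fields = {name: (sw >> bit) & 1 for name, bit in FLAG_BITS.items()}
    fields["TOP"] = (sw >> 11) & 0b111      # 3-bit top-of-stack pointer
    return fields

def stack_fault_kind(sw):
    """Classify a stack fault, or return None if there is none.

    A stack fault is reported with both IE and SF set; C1 then
    distinguishes overflow (1) from underflow (0).
    """
    fields = decode_status_word(sw)
    if not (fields["IE"] and fields["SF"]):
        return None
    return "overflow" if fields["C1"] else "underflow"

status = decode_status_word(0x1AC1)         # hypothetical sample value
```

In practice the value passed in would be whatever FSTSW AX or FNSTSW AX left
in the AX register.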
1193
1194
1195 Figure 2-2.  80387 Status Word
1196
1197          +--------------------------------------------------- 80387 BUSY
1198          |       +---+---+----------------------------------- TOP OF STACK POINTER
1199          |   +---|---|---|---+---+---+----------------------- CONDITION CODE
1200          v   v   v   v   v   v   v   v
1201         15                               7                           0
1202        +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1203        | B | C |    TOP    | C | C | C | E | S | P | U | O | Z | D | I |
1204        |   | 3 |   |   |   | 2 | 1 | 0 | S | F | E | E | E | E | E | E |
1205        +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1206                                          ^   ^   ^   ^   ^   ^   ^   ^
1207        ERROR SUMMARY STATUS -------------+   |   |   |   |   |   |   |
1208        STACK FAULT --------------------------+   |   |   |   |   |   |
1209        EXCEPTION FLAGS                           |   |   |   |   |   |
1210          PRECISION ------------------------------+   |   |   |   |   |
1211          UNDERFLOW ----------------------------------+   |   |   |   |
1212          OVERFLOW ---------------------------------------+   |   |   |
1213          ZERO DIVIDE ----------------------------------------+   |   |
1214          DENORMALIZED OPERAND -----------------------------------+   |
1215          INVALID OPERATION ------------------------------------------+
1216
1217 ----------------------------------------------------------------------------
1218 NOTE:
1219        ES IS SET IF ANY UNMASKED EXCEPTION BIT IS SET; CLEARED OTHERWISE.
1220        SEE TABLE 2-1 FOR INTERPRETATION OF CONDITION CODE.
1221        TOP VALUES:
1222            000 = REGISTER 0 IS TOP OF STACK
1223            001 = REGISTER 1 IS TOP OF STACK
1224                           .
1225                           .
1226                           .
1227            111 = REGISTER 7 IS TOP OF STACK
1228        FOR DEFINITIONS OF EXCEPTIONS, REFER TO CHAPTER 3.
1229 ----------------------------------------------------------------------------
1230
1231
1232 Table 2-1.  Condition Code Interpretation
1233
1234
1235 Instruction         C0 (S)     C3 (Z)      C1 (A)      C2 (C)
1236
1237 FPREM, FPREM1       Three least significant bits       Reduction
1238                             of quotient
1239
1240                     Q2         Q0          Q1          0=complete
1241                                            or O/U#     1=incomplete
1242
1243 FCOM, FCOMP,
1244 FCOMPP, FTST,       Result of comparison   Zero        Operand is not
1245 FUCOM, FUCOMP,                             or O/U#     comparable
1246 FUCOMPP, FICOM,
1247 FICOMP
1248
1249 FXAM                Operand class          Sign        Operand class
1250                                            or O/U#
1251
1252 FCHS, FABS,
1253 FXCH, FINCTOP,
1254 FDECTOP, Constant   UNDEFINED              Zero        UNDEFINED
1255 loads, FXTRACT,                            or O/U#
1256 FLD, FILD, FBLD,
1257 FSTP (ext real)
1258
1259 FIST, FBSTP,
1260 FRNDINT, FST,
1261 FSTP, FADD, FMUL,
1262 FDIV, FDIVR, FSUB,  UNDEFINED              Roundup     UNDEFINED
1263 FSUBR, FSCALE,                             or O/U#
1264 FSQRT, FPATAN,
1265 F2XM1, FYL2X,
1266 FYL2XP1
1267
1268 FPTAN, FSIN,        UNDEFINED              Roundup     Reduction
1269 FCOS, FSINCOS                              or O/U#     0=complete
1270                                            undefined   1=incomplete
1271                                            if C2=1
1272
1273 FLDENV, FRSTOR      Each bit loaded
1274                     from memory
1275
1276
1277 FLDCW, FSTENV,
1278 FSTCW, FSTSW,       UNDEFINED
1279 FCLEX, FINIT,
1280 FSAVE
1281
1282
1283 ----------------------------------------------------------------------------
1284 NOTES
1285   O/U#        When both IE and SF bits of status word are set,
1286               indicating a stack exception, this bit distinguishes
1287               between stack overflow (C1=1) and underflow (C1=0).
1288
1289   Reduction   If FPREM or FPREM1 produces a remainder that is less
1290               than the modulus, reduction is complete.  When reduction
1291               is incomplete the value at the top of the stack is a
1292               partial remainder, which can be used as input to further
1293               reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the
1294               reduction bit is set if the operand at the top of the
1295               stack is too large. In this case the original operand
1296               remains at the top of the stack.
1297
1298   Roundup     When the PE bit of the status word is set, this bit
1299               indicates whether the last rounding in the instruction
1300               was upward.
1301
1302   UNDEFINED   Do not rely on finding any specific value in these bits.
1303 ----------------------------------------------------------------------------
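
The FPREM/FPREM1 row of Table 2-1 can be read back from a stored status word
as sketched below in Python. The mapping follows the table (C0 = Q2,
C1 = Q1, C3 = Q0, C2 = reduction incomplete); the function names are
hypothetical, and the quotient bits are meaningful only once C2 reports
that reduction is complete.

```python
def fprem_quotient_low3(sw):
    """Three least significant quotient bits after a complete reduction."""
    q2 = (sw >> 8) & 1      # C0
    q1 = (sw >> 9) & 1      # C1
    q0 = (sw >> 14) & 1     # C3
    return (q2 << 2) | (q1 << 1) | q0

def reduction_incomplete(sw):
    """True while C2 is set, meaning another reduction step is needed."""
    return bool((sw >> 10) & 1)
```

A reduction loop would repeat the remainder instruction while
reduction_incomplete() holds and only then consult the quotient bits.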
1304
1305
1306 Table 2-2.  Correspondence between 80387 and 80386 Flag Bits
1307
1308 80387 Flag                    80386 Flag
1309
1310 C{0}                          CF
1311 C{1}                          (none)
1312 C{2}                          PF
1313 C{3}                          ZF
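
The correspondence in Table 2-2 follows from where the condition codes sit
in the high byte of the status word. The Python sketch below assumes the
status word has been stored into AX and that SAHF then copies AH into the
low byte of the flags register, where CF, PF, and ZF occupy bits 0, 2, and
6; the names used are hypothetical.

```python
# Bit masks of the 80386 flags within the byte loaded by SAHF.
CF_MASK, PF_MASK, ZF_MASK = 0x01, 0x04, 0x40

def flags_after_sahf(sw):
    """Flags seen by the CPU after FSTSW AX followed by SAHF."""
    ah = (sw >> 8) & 0xFF               # status word bits 8-15
    return {
        "CF": bool(ah & CF_MASK),       # C0, status word bit 8
        "PF": bool(ah & PF_MASK),       # C2, status word bit 10
        "ZF": bool(ah & ZF_MASK),       # C3, status word bit 14
    }
```

C{1} (status word bit 9) has no counterpart among these flags, which is why
the table lists it as "(none)".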
1314
1315
1316 2.1.3  Control Word
1317
1318 The NPX provides the programmer with several processing options, which are
1319 selected by loading a word from memory into the control word. Figure 2-3
1320 shows the format and encoding of the fields in the control word.
1321
1322 The low-order byte of this control word configures the 80387 exception
1323 masking. Bits 0-5 of the control word contain individual masks for each of
1324 the six exception conditions recognized by the 80387. The high-order byte of
1325 the control word configures the 80387 processing options, including
1326
1327   Ž  Precision control
1328   Ž  Rounding control
1329
1330 The precision-control bits (bits 8-9) can be used to set the 80387 internal
1331 operating precision at less than the default precision (64-bit significand).
1332 These control bits can be used to provide compatibility with the
1333 earlier-generation arithmetic processors having less precision than the
1334 80387. The precision-control bits affect the results of only the following
1335 five arithmetic instructions: ADD, SUB(R), MUL, DIV(R), and SQRT. No other
1336 operations are affected by PC.
1337
1338 The rounding-control bits (bits 10-11) provide for the common
1339 round-to-nearest mode, as well as directed rounding and true chop. Rounding
1340 control affects only the arithmetic instructions (refer to Chapter 3 for
1341 lists of arithmetic and nonarithmetic instructions).
1342
1343
1344 Figure 2-3.  80387 Control Word Format
1345
1346         15                               7                           0
1347        +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1348        | X   X   X | X |  RC   |  PC   | X   X | P | U | O | Z | D | I |
1349        |           |   |       |       |       | M | M | M | M | M | M |
1350        +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1351
1352        BITS 15-13  RESERVED
1353        BIT  12     (INFINITY CONTROL)
1354        BITS 11-10  ROUNDING CONTROL (RC)
1355        BITS 9-8    PRECISION CONTROL (PC)
1356        BITS 7-6    RESERVED
1357        EXCEPTION MASKS
1358          BIT 5     PRECISION (PM)
1359          BIT 4     UNDERFLOW (UM)
1360          BIT 3     OVERFLOW (OM)
1361          BIT 2     ZERO DIVIDE (ZM)
1362          BIT 1     DENORMALIZED OPERAND (DM)
1363          BIT 0     INVALID OPERATION (IM)
1364
1365 This "infinity control" bit is not meaningful to the 80387. To maintain
1366 compatibility with the 80287, this bit can be programmed; however,
1367 regardless of its value, the 80387 treats infinity in the affine sense
1368 (-∞ < +∞).
1369
1370 ---------------------------------------------------------------------------
1371 NOTE:
1372      PRECISION CONTROL                   ROUNDING CONTROL
1373        00--24 BITS (SINGLE PRECISION)      00--ROUND TO NEAREST OR EVEN
1374        01--(RESERVED)                      01--ROUND DOWN (TOWARD -∞)
1375        10--53 BITS (DOUBLE PRECISION)      10--ROUND UP (TOWARD +∞)
1376        11--64 BITS (EXTENDED PRECISION)    11--CHOP (TRUNCATE TOWARD ZERO)
1377 ---------------------------------------------------------------------------
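
As an illustration of the encoding in Figure 2-3, the following Python sketch
splits a control word into its mask, PC, and RC fields. The function and table
names are hypothetical, and the sample value 037FH is used only as a
convenient input (all six exceptions masked, extended precision, round to
nearest).

```python
# Field positions follow Figure 2-3; every name here is illustrative only.
PRECISION = {0b00: "24 bits (single)", 0b01: "reserved",
             0b10: "53 bits (double)", 0b11: "64 bits (extended)"}
ROUNDING = {0b00: "nearest or even", 0b01: "down (toward -inf)",
            0b10: "up (toward +inf)", 0b11: "chop (toward zero)"}
MASK_NAMES = ["IM", "DM", "ZM", "OM", "UM", "PM"]   # bits 0 through 5


def decode_control_word(cw):
    """Split a 16-bit control word into exception masks, PC, and RC."""
    masks = {name: bool((cw >> bit) & 1) for bit, name in enumerate(MASK_NAMES)}
    return {
        "masks": masks,
        "pc": PRECISION[(cw >> 8) & 0b11],    # bits 8-9
        "rc": ROUNDING[(cw >> 10) & 0b11],    # bits 10-11
    }


def set_rounding(cw, rc):
    """Return cw with the RC field (bits 10-11) replaced by rc (0..3)."""
    return (cw & ~(0b11 << 10) & 0xFFFF) | ((rc & 0b11) << 10)
```

A program would normally obtain the word with FSTCW and write a modified copy
back with FLDCW; the sketch only models the bit manipulation in between.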
1378
1379
1380 2.1.4  The NPX Tag Word
1381
1382 The tag word indicates the contents of each register in the register stack,
1383 as shown in Figure 2-4. The tag word is used by the NPX itself to
1384 distinguish between empty and nonempty register locations. Programmers of
1385 exception handlers may use this tag information to check the contents of a
1386 numeric register without performing complex decoding of the actual data in
1387 the register. The tag values from the tag word correspond to physical
1388 registers 0-7. Programmers must use the current top-of-stack (TOP) pointer
1389 stored in the NPX status word to associate these tag values with the
1390 relative stack registers ST(0) through ST(7).
1391
1392 The exact values of the tags are generated during execution of the FSTENV
1393 and FSAVE instructions according to the actual contents of the nonempty
1394 stack locations. During execution of other instructions, the 80387 updates
1395 the TW only to indicate whether a stack location is empty or nonempty.
1396
1397
1398 Figure 2-4.  80387 Tag Word Format
1399
1400     15                                                                   0
1401   +--------+--------+--------+--------+--------+--------+--------+--------+
1402   | TAG (7)| TAG (6)| TAG (5)| TAG (4)| TAG (3)| TAG (2)| TAG (1)| TAG (0)|
1403   +--------+--------+--------+--------+--------+--------+--------+--------+
1404                               TAG VALUES:
1405                               00 = VALID
1406                               01 = ZERO
1407                               10 = INVALID OR INFINITY
1408                               11 = EMPTY
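
The association between physical tags and stack-relative registers can be
sketched as follows. This Python fragment is illustrative only: the helper
names are hypothetical, and it assumes that TOP occupies bits 11-13 of the
status word and that ST(i) is physical register (TOP + i) modulo 8.

```python
TAG_NAMES = {0b00: "valid", 0b01: "zero",
             0b10: "invalid or infinity", 0b11: "empty"}


def physical_tags(tag_word):
    """Return the 2-bit tags for physical registers 0 through 7."""
    return [(tag_word >> (2 * reg)) & 0b11 for reg in range(8)]


def stack_tags(tag_word, status_word):
    """Return tag names indexed by stack position ST(0) through ST(7)."""
    top = (status_word >> 11) & 0b111        # assumed location of TOP
    tags = physical_tags(tag_word)
    return [TAG_NAMES[tags[(top + i) % 8]] for i in range(8)]
```

For example, with TOP = 6, a valid value in physical register 6, and a zero
in physical register 7, ST(0) reads as valid, ST(1) as zero, and ST(2)
through ST(7) as empty.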
1409
1410
1411 2.1.5  The NPX Instruction and Data Pointers
1412
1413 The instruction and data pointers provide support for programmed
1414 exception-handlers. These registers are actually located in the 80386, but
1415 appear to be located in the 80387 because they are accessed by the ESC
1416 instructions FLDENV, FSTENV, FSAVE, and FRSTOR. Whenever the 80386 decodes
1417 an ESC instruction, it saves the instruction address, the operand address
1418 (if present), and the instruction opcode.
1419
1420 When stored in memory, the instruction and data pointers appear in one of
1421 four formats, depending on the operating mode of the 80386 (protected mode
1422 or real-address mode) and depending on the operand-size attribute in effect
1423 (32-bit operand or 16-bit operand). When the 80386 is in virtual-8086 mode,
1424 the real-address mode formats are used.
1425
1426 Figures 2-5 through 2-8 show these pointers as they are stored following an
1427 FSTENV instruction.
1428
1429 The FSTENV and FSAVE instructions store this data into memory, allowing
1430 exception handlers to determine the precise nature of any numeric exceptions
1431 that may be encountered.
1432
1433 The instruction address saved in the 80386 (as in the 80287) points to any
1434 prefixes that preceded the instruction. This is different from the 8087, for
1435 which the instruction address points only to the ESC instruction opcode.
1436
1437 Note that the processor control instructions FINIT, FLDCW, FSTCW, FSTSW,
1438 FCLEX, FSTENV, FLDENV, FSAVE, FRSTOR, and FWAIT do not affect the data
1439 pointer. Note also that, except for the instructions just mentioned, the
1440 value of the data pointer is undefined if the prior ESC instruction did not
1441 have a memory operand.
1442
1443
1444 Figure 2-5.  Protected Mode 80387 Instruction and Data Pointer Image in
1445              Memory, 32-Bit Format
1446
1447                         32-BIT PROTECTED MODE FORMAT
1448
1449  31                23                15                7               0
1450 +-----------------+-----------------+-----------------+-----------------+
1451 |             RESERVED              |            CONTROL WORD           |0H
1452 +-----------------+-----------------+-----------------+-----------------+
1453 |             RESERVED              |            STATUS WORD            |4H
1454 +-----------------+-----------------+-----------------+-----------------+
1455 |             RESERVED              |              TAG WORD             |8H
1456 +-----------------+-----------------+-----------------+-----------------+
1457 |                               IP OFFSET                               |CH
1458 +-----------+-----+-----------------+-----------------+-----------------+
1459 | 0 0 0 0 0 |     OPCODE 10..0      |            CS SELECTOR            |10H
1460 +-----------+-----+-----------------+-----------------+-----------------+
1461 |                          DATA OPERAND OFFSET                          |14H
1462 +-----------------+-----------------+-----------------+-----------------+
1463 |             RESERVED              |         OPERAND SELECTOR          |18H
1464 +-----------------+-----------------+-----------------+-----------------+
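
An exception handler that has stored the environment in the 32-bit
protected-mode layout of Figure 2-5 could pick the fields apart as in the
following Python sketch. The function name and dictionary keys are
hypothetical; the only assumption beyond the figure is the little-endian byte
order of 80386 memory.

```python
import struct


def parse_env32_protected(image):
    """Decode the 28-byte image of Figure 2-5 (seven little-endian dwords)."""
    cw, sw, tw, ip_off, cs_op, data_off, data_sel = struct.unpack(
        "<7I", image[:28])
    return {
        "control_word": cw & 0xFFFF,          # offset 0H, low word
        "status_word": sw & 0xFFFF,           # offset 4H, low word
        "tag_word": tw & 0xFFFF,              # offset 8H, low word
        "ip_offset": ip_off,                  # offset CH
        "cs_selector": cs_op & 0xFFFF,        # offset 10H, bits 15..0
        "opcode": (cs_op >> 16) & 0x7FF,      # offset 10H, bits 26..16
        "operand_offset": data_off,           # offset 14H
        "operand_selector": data_sel & 0xFFFF,  # offset 18H, low word
    }
```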
1465
1466
1467 Figure 2-6.  Real Mode 80387 Instruction and Data Pointer Image in
1468              Memory, 32-Bit Format
1469
1470                       32-BIT REAL ADDRESS MODE FORMAT
1471
1472  31                23                15                7               0
1473 +-----------------+-----------------+-----------------+-----------------+
1474 |             RESERVED              |            CONTROL WORD           |0H
1475 +-----------------+-----------------+-----------------+-----------------+
1476 |             RESERVED              |            STATUS WORD            |4H
1477 +-----------------+-----------------+-----------------+-----------------+
1478 |             RESERVED              |              TAG WORD             |8H
1479 +-----------------+-----------------+-----------------+-----------------+
1480 |             RESERVED              |     INSTRUCTION POINTER 15..0     |CH
1481 +---------+-------+-----------------+-----------+-+---+-----------------+
1482 | 0 0 0 0 |      INSTRUCTION POINTER 31..16     |0|     OPCODE 10..0    |10H
1483 +---------+-------+-----------------+-----------+-+---+-----------------+
1484 |             RESERVED              |       OPERAND POINTER 15..0       |14H
1485 +---------+-------+-----------------+-----------+-----+-----------------+
1486 | 0 0 0 0 |        OPERAND POINTER 31..16       |0 0 0 0 0 0 0 0 0 0 0 0|18H
1487 +---------+-------+-----------------+-----------+-----+-----------------+
1488
1489
1490 Figure 2-7.  Protected Mode 80387 Instruction and Data Pointer Image in
1491              Memory, 16-Bit Format
1492
1493                         16-BIT PROTECTED MODE FORMAT
1494
1495                      15                7              0
1496                     +-----------------+----------------+
1497                     |           CONTROL WORD           | 0H
1498                     +-----------------+----------------+
1499                     |            STATUS WORD           | 2H
1500                     +-----------------+----------------+
1501                     |             TAG WORD             | 4H
1502                     +-----------------+----------------+
1503                     |             IP OFFSET            | 6H
1504                     +-----------------+----------------+
1505                     |            CS SELECTOR           | 8H
1506                     +-----------------+----------------+
1507                     |          OPERAND OFFSET          | AH
1508                     +-----------------+----------------+
1509                     |         OPERAND SELECTOR         | CH
1510                     +-----------------+----------------+
1511
1512
1513 Figure 2-8.  Real Mode 80387 Instruction and Data Pointer Image in
1514              Memory, 16-Bit Format
1515
1516                           16-BIT REAL-ADDRESS MODE
1517                         AND VIRTUAL-8086 MODE FORMAT
1518
1519                      15                7              0
1520                     +----------------+----------------+
1521                     |          CONTROL WORD           | 0H
1522                     +----------------+----------------+
1523                     |           STATUS WORD           | 2H
1524                     +----------------+----------------+
1525                     |            TAG WORD             | 4H
1526                     +----------------+----------------+
1527                     |    INSTRUCTION POINTER 15..0    | 6H
1528                     +---------+-+----+----------------+
1529                     |IP 19..16|0|      OPCODE 10..0   | 8H
1530                     +---------+-+----+----------------+
1531                     |      OPERAND POINTER 15..0      | AH
1532                     +---------+-+----+----------------+
1533                     |OP 19..16|0|0 0 0 0 0 0 0 0 0 0 0| CH
1534                     +---------+-+----+----------------+
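
In the 16-bit real-address format of Figure 2-8, the 20-bit instruction and
operand pointers are each split across two words. The following Python sketch
(hypothetical names, little-endian byte order assumed) reassembles them.

```python
import struct


def parse_env16_real(image):
    """Decode the 14-byte image of Figure 2-8 (seven little-endian words)."""
    cw, sw, tw, ip_lo, ip_hi_op, op_lo, op_hi = struct.unpack(
        "<7H", image[:14])
    return {
        "control_word": cw,
        "status_word": sw,
        "tag_word": tw,
        # Bits 15..12 of the word at 8H carry IP 19..16.
        "instruction_pointer": ((ip_hi_op >> 12) << 16) | ip_lo,
        # Bits 10..0 of the same word carry the opcode.
        "opcode": ip_hi_op & 0x7FF,
        # Bits 15..12 of the word at CH carry operand pointer 19..16.
        "operand_pointer": ((op_hi >> 12) << 16) | op_lo,
    }
```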
1535
1536
1537 2.2  Computation Fundamentals
1538
1539 This section covers 80387 programming concepts that are common to all
1540 applications. It describes the 80387's internal number system and the
1541 various types of numbers that can be employed in NPX programs. The most
1542 commonly used options for rounding and precision (selected by fields in the
1543 control word) are described, with exhaustive coverage of less frequently
1544 used facilities deferred to later sections. Exception conditions that may
1545 arise during execution of NPX instructions are also described along with the
1546 options that are available for responding to these exceptions.
1547
1548
1549 2.2.1  Number System
1550
1551 The system of real numbers that people use for pencil and paper
1552 calculations is conceptually infinite and continuous. There is no upper or
1553 lower limit to the magnitude of the numbers one can employ in a calculation,
1554 or to the precision (number of significant digits) that the numbers can
1555 represent. When considering any real number, there are always arbitrarily
1556 many numbers both larger and smaller. There are also arbitrarily many
1557 numbers between (i.e., with more significant digits than) any two real
1558 numbers. For example, between 2.5 and 2.6 are 2.51, 2.5897, 2.500001, etc.
1559
1560 While ideally it would be desirable for a computer to be able to operate on
1561 the entire real number system, in practice this is not possible. Computers,
1562 no matter how large, ultimately have fixed-size registers and memories that
1563 limit the system of numbers that can be accommodated. These limitations
1564 determine both the range and the precision of numbers. The result is a set
1565 of numbers that is finite and discrete, rather than infinite and
1566 continuous. This sequence is a subset of the real numbers that is designed
1567 to form a useful approximation of the real number system.
1568
1569 Figure 2-9 superimposes the basic 80387 real number system on a real number
1570 line (decimal numbers are shown for clarity, although the 80387 actually
1571 represents numbers in binary). The dots indicate the subset of real numbers
1572 the 80387 can represent as data and final results of calculations. The
1573 80387's range of double-precision, normalized numbers is approximately
1574 ±2.23 * 10^(-308) to ±1.80 * 10^(308). Applications that are required to
1575 deal with data and final results outside this range are rare. For reference,
1576 the range of the IBM System 370* is about ±0.54 * 10^(-78) to
1577 ±0.72 * 10^(76).
1578
1579 The finite spacing in Figure 2-9 illustrates that the NPX can represent a
1580 great many, but not all, of the real numbers in its range. There is always a
1581 gap between two adjacent 80387 numbers, and it is possible for the result of
1582 a calculation to fall in this space. When this occurs, the NPX rounds the
1583 true result to a number that it can represent. Thus, a real number that
1584 requires more digits than the 80387 can accommodate (e.g., a 20-digit
1585 number) is represented with some loss of accuracy. Notice also that the
1586 80387's representable numbers are not distributed evenly along the real
1587 number line. In fact, an equal number of representable numbers exists
1588 between successive powers of 2 (i.e., as many representable numbers exist
1589 between 2 and 4 as between 65,536 and 131,072). Therefore, the gaps between
1590 representable numbers are larger as the numbers increase in magnitude. All
1591 integers in the range ±2^(64) (approximately ±1.8 * 10^(19)), however, are
1592 exactly representable.
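
The uneven spacing can be observed directly on any machine whose floating
point type is the IEEE double format. The following Python sketch is an
illustration only: it assumes that the Python `float` type is a 64-bit double
(as it is on common platforms), and the helper names are hypothetical.

```python
import struct


def double_bits(x):
    """Return the 64-bit pattern of a double as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def count_between(lo, hi):
    """Count the doubles in [lo, hi) for positive normalized lo < hi."""
    return double_bits(hi) - double_bits(lo)


def next_up(x):
    """Return the next representable double above a positive finite x."""
    return struct.unpack("<d", struct.pack("<Q", double_bits(x) + 1))[0]
```

Under these assumptions, `count_between(2.0, 4.0)` and
`count_between(65536.0, 131072.0)` both return 2^(52), while the gap above
2.0 is 2^(-51) and the gap above 65,536 is 2^(-36).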
1593
1594 In its internal operations, the 80387 actually employs a number system that
1595 is a substantial superset of that shown in Figure 2-9. The internal format
1596 (called extended real) extends the 80387's range to about ±3.36 * 10^(-4932)
1597 to ±1.19 * 10^(4932), and its precision to about 19 (equivalent decimal)
1598 digits. This format is designed to provide extra range and precision for
1599 constants and intermediate results, and is not normally intended for data
1600 or final results.
1601
1602 From a practical standpoint, the 80387's set of real numbers is
1603 sufficiently large and dense so as not to limit the vast majority of
1604 microprocessor applications. Compared to most computers, including
1605 mainframes, the NPX provides a very good approximation of the real number
1606 system. It is important to remember, however, that it is not an exact
1607 representation, and that arithmetic on real numbers is inherently
1608 approximate.
1609
1610 Conversely, and equally important, the 80387 does perform exact arithmetic
1611 on integer operands. That is, if an operation on two integers is valid and
1612 produces a result that is in range, the result is exact. For example, 4 ÷ 2
1613 yields an exact integer, 1 ÷ 3 does not, and 2^(40) * 2^(30) + 1 does not,
1614 because the result requires greater than 64 bits of precision.
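
The three examples can be checked with exact rational arithmetic. The Python
sketch below is a model, not a description of the 80387's internal algorithm:
it tests only whether a value can be written as an integer of at most 64 bits
times a power of 2, and it ignores exponent range. The function name is
hypothetical.

```python
from fractions import Fraction


def exact_in_significand(value, p=64):
    """True if value equals m * 2**k for integers m, k with |m| < 2**p."""
    value = Fraction(value)
    if value == 0:
        return True
    den = value.denominator
    if den & (den - 1):            # denominator is not a power of two
        return False
    num = abs(value.numerator)
    while num % 2 == 0:            # trailing zero bits go into the exponent
        num //= 2
    return num.bit_length() <= p
```

With this model, 4 ÷ 2 is exact, 1 ÷ 3 is not, and 2^(40) * 2^(30) + 1 is not
because it needs 71 significant bits.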
1615
1616
1617 Figure 2-9.  80387 Double-Precision Number System
1618
1619  [Number-line diagram. NEGATIVE RANGE (NORMALIZED): -1.80 X 10^(308) to
1620   -2.23 X 10^(-308). POSITIVE RANGE (NORMALIZED): 2.23 X 10^(-308) to
1621   1.80 X 10^(308). An inset labeled PRECISION, 18 DIGITS magnifies the line
1622   between the points 1.99999999999999999 and 2.00000000000000000; a value
1623   lying between them is marked NOT REPRESENTABLE.]
1640
1641
1642 2.2.2  Data Types and Formats
1643
1644 The 80387 recognizes seven numeric data types for memory-based values,
1645 divided into three classes: binary integers, packed decimal integers, and
1646 binary reals. A later section describes how these formats are stored in
1647 memory (the sign is always located in the highest-addressed byte).
1648
1649 Figure 2-10 summarizes the format of each data type. In the figure, the
1650 most significant digits of all numbers (and fields within numbers) are the
1651 leftmost digits.
1652
1653
1654 Figure 2-10.  80387 Data Formats
1655
1656  (MOST SIGNIFICANT BYTE = HIGHEST ADDRESSED BYTE)
1657
1658  DATA FORMAT         RANGE       PRECISION  BITS   FIELD
1659  ------------------  ----------  ---------  -----  ------------------------
1660  WORD INTEGER        10^(4)      16 BITS    15-0   (TWO'S COMPLEMENT)
1661  SHORT INTEGER       10^(9)      32 BITS    31-0   (TWO'S COMPLEMENT)
1662  LONG INTEGER        10^(19)     64 BITS    63-0   (TWO'S COMPLEMENT)
1663  PACKED BCD          10^(18)     18 DIGITS  79     S
1664                                             78-72  X
1665                                             71-0   MAGNITUDE d{17} d{16} ... d{1} d{0}
1666  SINGLE PRECISION    10^(±38)    24 BITS    31     S
1667                                             30-23  BE
1668                                             22-0   SIGNIFICAND
1669  DOUBLE PRECISION    10^(±308)   53 BITS    63     S
1670                                             62-52  BE
1671                                             51-0   SIGNIFICAND
1672  EXTENDED PRECISION  10^(±4932)  64 BITS    79     S
1673                                             78-64  BE
1674                                             63     I ^
1675                                             62-0   SIGNIFICAND
1695
1696 ---------------------------------------------------------------------------
1697 NOTE:
1698    (1) BE = BIASED EXPONENT
1699    (2) S = SIGN BIT (0 = positive, 1 = negative)
1700    (3) d{n} = DECIMAL DIGIT (TWO PER BYTE)
1701    (4) X = BITS HAVE NO SIGNIFICANCE; 80387 IGNORES WHEN LOADING,
1702            ZEROS WHEN STORING
1703    (5) ^ = POSITION OF IMPLICIT BINARY POINT
1704    (6) I = INTEGER BIT OF SIGNIFICAND; STORED IN EXTENDED REAL,
1705        IMPLICIT IN SINGLE AND DOUBLE PRECISION
1706    (7) EXPONENT BIAS (NORMALIZED VALUES):
1707        SINGLE: 127 (7FH)
1708        DOUBLE: 1023 (3FFH)
1709        EXTENDED REAL: 16383 (3FFFH)
1710    (8) PACKED BCD: (-1)^(S) (D{17}...D{0})
1711    (9) REAL: (-1)^(S) (2^(E-BIAS)) (F{0}.F{1}...)
1712 ---------------------------------------------------------------------------
1713
1714
1715 2.2.2.1  Binary Integers
1716
1717 The three binary integer formats are identical except for length, which
1718 governs the range that can be accommodated in each format. The leftmost bit
1719 is interpreted as the number's sign: 0 = positive and 1 = negative. Negative
1720 numbers are represented in standard two's complement notation (the binary
1721 integers are the only 80387 format to use two's complement). The quantity
1722 zero is represented with a positive sign (all bits are 0). The 80387 word
1723 integer format is identical to the 16-bit signed integer data type of the
1724 80386; the 80387 short integer format is identical to the 32-bit signed
1725 integer data type of the 80386.
1726
1727 The binary integer formats exist in memory only. When used by the 80387,
1728 they are automatically converted to the 80-bit extended real format. All
1729 binary integers are exactly representable in the extended real format.
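
As a brief illustration of the three memory formats, the following Python
sketch encodes and decodes them with the standard `struct` module. The names
`FORMATS`, `encode_integer`, and `decode_integer` are hypothetical; the
little-endian layout places the sign bit in the highest-addressed byte.

```python
import struct

# Little-endian, two's complement: 16-, 32-, and 64-bit signed integers.
FORMATS = {"word": "<h", "short": "<i", "long": "<q"}


def encode_integer(value, kind):
    """Return the memory image of value in the named integer format."""
    return struct.pack(FORMATS[kind], value)


def decode_integer(data, kind):
    """Return the integer held in a memory image of the named format."""
    return struct.unpack(FORMATS[kind], data)[0]
```

For instance, the word integer -2 is stored as the bytes FEH, FFH, and the
top bit of the last (highest-addressed) byte is the sign.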
1730
1731
1732 2.2.2.2  Decimal Integers
1733
1734 Decimal integers are stored in packed decimal notation, with two decimal
1735 digits "packed" into each byte, except the leftmost byte, which carries the
1736 sign bit (0 = positive, 1 = negative). Negative numbers are not stored in
1737 two's complement form and are distinguished from positive numbers only by
1738 the sign bit. The most significant digit of the number is the leftmost
1739 digit. All digits must be in the range 0-9.
1740
1741 The decimal integer format exists in memory only. When used by the 80387,
1742 it is automatically converted to the 80-bit extended real format. All
1743 decimal integers are exactly representable in the extended real format.
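
The packed decimal layout can be sketched in Python as follows. This is an
illustration with hypothetical function names; it writes zeros into the
unused X bits and places digits d{1} d{0} in the lowest-addressed byte, with
the sign in the highest-addressed (tenth) byte.

```python
def encode_packed_bcd(value):
    """Return the 10-byte packed decimal image of an integer."""
    if abs(value) >= 10**18:
        raise ValueError("packed decimal holds at most 18 digits")
    digits = f"{abs(value):018d}"            # digits[0] is d17, digits[17] is d0
    # Each pair of decimal digits becomes one byte, lowest pair first.
    body = bytes(int(digits[i:i + 2], 16) for i in range(16, -1, -2))
    sign = 0x80 if value < 0 else 0x00
    return body + bytes([sign])


def decode_packed_bcd(data):
    """Return the integer held in a 10-byte packed decimal image."""
    magnitude = 0
    for byte in reversed(data[:9]):          # most significant pair first
        hi, lo = byte >> 4, byte & 0x0F
        if hi > 9 or lo > 9:
            raise ValueError("digit outside the range 0-9")
        magnitude = magnitude * 100 + hi * 10 + lo
    return -magnitude if data[9] & 0x80 else magnitude
```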
1744
1745
1746 2.2.2.3  Real Numbers
1747
1748 The 80387 represents real numbers of the form:
1749
1750   (-1)^(s) 2^(E) (b{0}.b{1}b{2}b{3}..b{p-1})
1751
1752 ...where...
1753
1754   s = 0 or 1
1755   E = any integer between Emin and Emax, inclusive
1756   b{i} = 0 or 1
1757   p = number of bits of precision
1758
1759 Table 2-3 summarizes the parameters for each of the three real-number
1760 formats.
1761
1762 The 80387 stores real numbers in a three-field binary format that resembles
1763 scientific, or exponential, notation. The format consists of the following
1764 fields:
1765
1766   -  The number's significant digits are held in the significand field,
1767      b{0}.b{1} b{2} b{3}..b{p-1}. (The term "significand" is analogous
1768      to the term "mantissa" used to describe floating point numbers on some
1769      computers.)
1770
1771   -  The exponent field, e = E+bias, locates the binary point within the
1772      significant digits (and therefore determines the number's magnitude).
1773      (The term "exponent" is analogous to the term "characteristic" used to
1774      describe floating point numbers on some computers.)
1775
1776   -  The 1-bit sign field indicates whether the number is positive or
1777      negative. Negative numbers differ from positive numbers only in the
1778      sign bits of their significands.
1779
1780 Table 2-4 shows how the real number 178.125 (decimal) is stored in the
1781 80387 single real format. The table lists a progression of equivalent
1782 notations that express the same value to show how a number can be converted
1783 from one form to another. (The ASM386 and PL/M-386 language translators
1784 perform a similar process when they encounter programmer-defined real number
1785 constants.) Note that not every decimal fraction has an exact binary
1786 equivalent. The decimal number 1/10, for example, cannot be expressed
1787 exactly in binary (just as the number 1/3 cannot be expressed exactly in
1788 decimal). When a translator encounters such a value, it produces a rounded
1789 binary approximation of the decimal value.
1790
1791 The NPX usually carries the digits of the significand in normalized form.
1792 This means that, except for the value zero, the significand contains an
1793 integer bit and fraction bits as follows:
1794
1795 1.fff...ff
1796
1797 where "." indicates an assumed binary point. The number of fraction bits
1798 varies according to the real format: 23 for single, 52 for double, and 63
1799 for extended real. By normalizing real numbers so that their integer bit is
1800 always a 1, the 80387 eliminates leading zeros in small values (|x| < 1).
1801 This technique maximizes the number of significant digits that can be
1802 accommodated in a significand of a given width. Note that, in the single
1803 and double formats, the integer bit is implicit and is not actually stored;
1804 the integer bit is physically present in the extended format only.
1805
1806 If one were to examine only the significand with its assumed binary point,
1807 all normalized real numbers would have values greater than or equal to 1 and
1808 less than 2. The exponent field locates the actual binary point in the
1809 significant digits. Just as in decimal scientific notation, a positive
1810 exponent has the effect of moving the binary point to the right, and a
1811 negative exponent effectively moves the binary point to the left, inserting
1812 leading zeros as necessary. An unbiased exponent of zero indicates that the
1813 position of the assumed binary point is also the position of the actual
1814 binary point. The exponent field, then, determines a real number's
1815 magnitude.
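
The decomposition into a significand between 1 and 2 and an unbiased exponent
can be reproduced with the Python standard library, as in this illustrative
sketch (the function name is hypothetical). `math.frexp` returns a fraction
in the interval [0.5, 1), so the sketch rescales it by one bit position.

```python
import math


def normalized_parts(x):
    """Return (sign, significand, exponent) with 1 <= significand < 2."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("only nonzero finite values have a normalized form")
    m, e = math.frexp(abs(x))        # abs(x) == m * 2**e with 0.5 <= m < 1
    sign = 1 if x < 0 else 0
    return sign, m * 2, e - 1        # shift the binary point one place
```

For 178.125 the result is sign 0, significand 1.3916015625, and exponent 7,
that is, 178.125 = 1.3916015625 * 2^(7).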
1816
1817 In order to simplify comparing real numbers (e.g., for sorting), the 80387
1818 stores exponents in a biased form. This means that a constant is added to
1819 the true exponent described above. As Table 2-3 shows, the value of this
1820 bias is different for each real format. It has been chosen so as to
1821 force the biased exponent to be a positive value. This allows two real
1822 numbers (of the same format and sign) to be compared as if they are unsigned
1823 binary integers. That is, when comparing them bitwise from left to right
1824 (beginning with the leftmost exponent bit), the first bit position that
1825 differs orders the numbers; there is no need to proceed further with the
1826 comparison. A number's true exponent can be determined simply by
1827 subtracting the bias value of its format.
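
The ordering property can be demonstrated with the single real format. In
this Python sketch (hypothetical helper name, positive values only), sorting
by numeric value and sorting by the raw 32-bit pattern treated as an unsigned
integer give the same order.

```python
import struct


def single_bits(x):
    """Return the 32-bit single real pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


values = [178.125, 0.5, 3.0e38, 1.0, 1.0e-30, 1.5]
by_value = sorted(values)
by_pattern = sorted(values, key=single_bits)
```

Here `by_value` and `by_pattern` are identical lists. The same comparison
does not hold across signs, which is why the text restricts it to numbers of
the same format and sign.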
1828
1829 The single and double real formats exist in memory only. If a number in one
1830 of these formats is loaded into an 80387 register, it is automatically
1831 converted to extended format, the format used for all internal operations.
1832 Likewise, data in registers can be converted to single or double real for
1833 storage in memory. The extended real format may be used in memory also,
1834 typically to store intermediate results that cannot be held in registers.
1835
1836 Most applications should use the double format to store real-number data
1837 and results; it provides sufficient range and precision to return correct
1838 results with a minimum of programmer attention. The single real format is
1839 appropriate for applications that are constrained by memory, but it should
1840 be recognized that this format provides a smaller margin of safety. It is
1841 also useful for the debugging of algorithms, because roundoff problems will
1842 manifest themselves more quickly in this format. The extended real format
1843 should normally be reserved for holding intermediate results, loop
1844 accumulations, and constants. Its extra length is designed to shield final
1845 results from the effects of rounding and overflow/underflow in intermediate
1846 calculations. However, the range and precision of the double format are
1847 adequate for most microcomputer applications.
1848
1849
1850 Table 2-3.  Summary of Format Parameters
1851
1852 Parameter                 |-------- Format --------|
1853                           Single   Double   Extended
1854
1855 Format width in bits          32       64         80
1856 p (bits of precision)         24       53         64
1857 Exponent width in bits         8       11         15
1858 Emax                        +127    +1023     +16383
1859 Emin                        -126    -1022     -16382
1860 Exponent bias               +127    +1023     +16383
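
The entries of Table 2-3 are related to one another. As an observation drawn
from the table rather than a statement in it, each bias equals
2^(w-1) - 1 for an exponent field of w bits, Emax equals the bias, and Emin
equals 1 - bias. The Python sketch below (hypothetical names) derives the
rows from the exponent width and precision and computes the largest
normalized value exactly.

```python
import math
import sys


def format_parameters(exponent_width, p):
    """Derive bias, Emax, and Emin from the exponent field width."""
    bias = 2 ** (exponent_width - 1) - 1
    return {"p": p, "bias": bias, "emax": bias, "emin": 1 - bias}


def largest_normalized(params):
    """Largest finite value as an exact integer: (2**p - 1) * 2**(emax-p+1)."""
    return (2 ** params["p"] - 1) * 2 ** (params["emax"] - params["p"] + 1)


SINGLE = format_parameters(8, 24)
DOUBLE = format_parameters(11, 53)
EXTENDED = format_parameters(15, 64)

# Decimal order of magnitude of the largest extended real value.
EXTENDED_DECIMAL_EXPONENT = math.floor(math.log10(largest_normalized(EXTENDED)))
```

On a platform whose `float` is an IEEE double,
`float(largest_normalized(DOUBLE))` equals `sys.float_info.max`.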
1861
1862
1863 Table 2-4.  Real Number Notation
1864
1865 Notation                Value
1866
1867 Ordinary Decimal        178.125
1868 Scientific Decimal      1.78125E2
1869 Scientific Binary       1.0110010001E111
1870 Scientific Binary       1.0110010001E10000110
1871 (Biased Exponent)
1872 80387 Single Format     Sign    Biased Exponent     Significand
1873 (Normalized)            0       10000110            01100100010000000000000
1874                                                     1.(implicit)
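
The last row of Table 2-4 can be reproduced by packing 178.125 into the
single real format and splitting the bit pattern into its three fields. The
Python sketch below is illustrative; the function names are hypothetical.

```python
import struct


def single_fields(x):
    """Return (sign, biased exponent, fraction) of x in single real format."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                       # bit 31
    biased_exponent = (bits >> 23) & 0xFF   # bits 30-23
    fraction = bits & 0x7FFFFF              # bits 22-0 (integer bit implicit)
    return sign, biased_exponent, fraction


def format_fields(x):
    """Render the three fields in binary, as laid out in Table 2-4."""
    s, e, f = single_fields(x)
    return f"{s} {e:08b} {f:023b}"
```

`format_fields(178.125)` yields `0 10000110 01100100010000000000000`, and
subtracting the bias of 127 from the biased exponent 134 recovers the true
exponent 7.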
1875
1876
1877 2.2.3  Rounding Control
1878
1879 Internally, the 80387 employs three extra bits (guard, round, and sticky
1880 bits) that enable it to round numbers in accord with the infinitely precise
1881 true result of a computation; these bits are not accessible to programmers.
1882 Whenever the destination can represent the infinitely precise true result,
1883 the 80387 delivers it. Rounding occurs in arithmetic and store operations
1884 when the format of the destination cannot exactly represent the infinitely
1885 precise true result. For example, a real number may be rounded if it is
1886 stored in a shorter real format, or in an integer format. Or, the infinitely
1887 precise true result may be rounded when it is returned to a register.
1888
1889 The NPX has four rounding modes, selectable by the RC field in the control
1890 word (see Figure 2-3). Given a true result b that cannot be represented by
1891 the target data type, the 80387 determines the two representable numbers a
1892 and c that most closely bracket b in value (a < b < c). The processor then
1893 rounds (changes) b to a or to c according to the mode selected by the RC
1894 field as shown in Table 2-5. Rounding introduces an error in a result that
1895 is less than one unit in the last place to which the result is rounded.
1896
1897   •  "Round to nearest" is the default mode and is suitable for most
1898      applications; it provides the most accurate and statistically unbiased
1899      estimate of the true result.
1900
1901   •  The "chop" or "round toward zero" mode is provided for integer
1902      arithmetic applications.
1903
1904   •  "Round up" and "round down" are termed directed rounding and can be
1905      used to implement interval arithmetic. Interval arithmetic generates a
1906      certifiable result independent of the occurrence of rounding and other
1907      errors. The upper and lower bounds of an interval may be computed by
1908      executing an algorithm twice, rounding up in one pass and down in the
1909      other.
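
The two-pass scheme for interval arithmetic can be sketched in software. The
Python example below does not touch the RC field; it models directed
rounding with exact rational arithmetic, assuming a 24-bit significand. The
names round_directed and interval_sum are hypothetical.

```python
import math
from fractions import Fraction


def round_directed(x, p, up):
    """Round x to p significant bits, toward +infinity if up, else -infinity."""
    if x == 0:
        return Fraction(0)
    m = abs(x)
    # Find e such that 2**e <= m < 2**(e + 1).
    e = m.numerator.bit_length() - m.denominator.bit_length()
    if Fraction(2) ** e > m:
        e -= 1
    ulp = Fraction(2) ** (e - p + 1)
    q = x / ulp
    n = math.ceil(q) if up else math.floor(q)
    return n * ulp


def interval_sum(terms, p=24):
    """Run the same summation twice: once rounding down, once rounding up."""
    lo = hi = Fraction(0)
    for t in terms:
        lo = round_directed(lo + t, p, up=False)
        hi = round_directed(hi + t, p, up=True)
    return lo, hi


terms = [Fraction(1, n) for n in range(1, 11)]
lo, hi = interval_sum(terms)
print(float(lo), float(hi))
```

The exact sum is guaranteed to lie between the two bounds, whatever
rounding errors occur along the way.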
1910
1911 Rounding control affects only the arithmetic instructions (refer to Chapter
1912 3 for lists of arithmetic and nonarithmetic instructions).
1913
1914
1915 2.2.4  Precision Control
1916
1917 The 80387 allows results to be calculated with either 64, 53, or 24 bits of
1918 precision in the significand as selected by the precision control (PC) field
1919 of the control word. The default setting, and the one that is best suited
1920 for most applications, is the full 64 bits of significance provided by the
1921 extended real format. The other settings are required by the IEEE standard
1922 and are provided to obtain compatibility with the specifications of certain
1923 existing programming languages. Specifying less precision nullifies the
1924 advantages of the extended format's extended fraction length. When reduced
1925 precision is specified, the rounding of the fractional value clears the
1926 unused bits on the right to zeros.
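
The effect of the three precision settings can be illustrated with exact
rational arithmetic. This Python sketch is a model only (it does not read or
write the control word), and the name round_to_precision is hypothetical. It
rounds a value to p significant bits using round to nearest, so every bit to
the right of the p-th bit ends up zero.

```python
from fractions import Fraction


def round_to_precision(x, p):
    """Round x to p significant bits (round to nearest, ties to even)."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    m = abs(x)
    # Find e such that 2**e <= m < 2**(e + 1).
    e = m.numerator.bit_length() - m.denominator.bit_length()
    if Fraction(2) ** e > m:
        e -= 1
    ulp = Fraction(2) ** (e - p + 1)
    n = round(m / ulp)      # round() on a Fraction rounds half to even
    return sign * n * ulp


third = Fraction(1, 3)
for p in (24, 53, 64):
    error = abs(round_to_precision(third, p) - third)
    print(p, float(error))
```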
1927
1928
1929 Table 2-5.  Rounding Modes
1930
1931 RC Field    Rounding Mode            Rounding Action
1932
1933 00          Round to nearest         Closer to b of a or c; if equally
1934                                      close, select even number (the one
1935                                      whose least significant bit is zero).
1936 01          Round down (toward -∞)   a
1937 10          Round up (toward +∞)     c
1938 11          Chop (toward 0)          Smaller in magnitude of a or c.
1939
1940 ------------------------------------------------------------------------
1941 NOTE
1942   a < b < c; a and c are successive representable numbers; b is not
1943   representable.
1944 ------------------------------------------------------------------------
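
The rounding actions of Table 2-5 can be modeled for an integer destination,
where a and c are the integers on either side of b. The Python sketch below
is illustrative only; round_with_rc is a hypothetical name, and the rc
argument uses the RC field encodings from the table.

```python
import math
from fractions import Fraction


def round_with_rc(b, rc):
    """Round b to an integer according to the RC encoding of Table 2-5."""
    b = Fraction(b)
    a, c = math.floor(b), math.ceil(b)
    if a == c:
        return a                        # b is representable; no rounding
    if rc == 0b01:                      # round down
        return a
    if rc == 0b10:                      # round up
        return c
    if rc == 0b11:                      # chop: smaller in magnitude
        return a if b > 0 else c
    if b - a != c - b:                  # round to nearest
        return a if b - a < c - b else c
    return a if a % 2 == 0 else c       # tie: pick the even neighbor


for value in (Fraction(5, 2), Fraction(-5, 2)):
    print(value, [round_with_rc(value, rc) for rc in (0b00, 0b01, 0b10, 0b11)])
```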
1945
1946
1947 Chapter 3  Special Computational Situations
1948
1949 ------------------------------------------------------------------------
1950
1951 Besides being able to represent positive and negative numbers, the 80387
1952 data formats may be used to describe other entities. These special values
1953 provide extra flexibility, but most users will not need to understand them
1954 in order to use the 80387 successfully. This section describes the special
1955 values that may occur in certain cases and the significance of each. The
1956 80387 exceptions are also described, for writers of exception handlers and
1957 for those interested in probing the limits of computation using the 80387.
1958
1959 The material presented in this section is mainly of interest to programmers
1960 concerned with writing exception handlers. Many readers will only need to
1961 skim this section.
1962
1963 When discussing these special computational situations, it is useful to
1964 distinguish between arithmetic instructions and nonarithmetic instructions.
1965 Nonarithmetic instructions are those that have no operands or transfer their
1966 operands without substantial change; arithmetic instructions are those that
1967 make significant changes to their operands. Table 3-1 defines these two
1968 classes of instructions.
1969
1970
1971 Table 3-1.  Arithmetic and Nonarithmetic Instructions
1972
1973
1974 Nonarithmetic Instructions             Arithmetic Instructions
1975
1976 FABS                                   F2XM1
1977 FCHS                                   FADD (P)
1978 FCLEX                                  FBLD
1979 FDECSTP                                FBSTP
1980 FFREE                                  FCOM(P)(P)
1981 FINCSTP                                FCOS
1982 FINIT                                  FDIV(R)(P)
1983 FLD (register-to-register)             FIADD
1984 FLD (extended format from memory)      FICOM(P)
1985 FLD constant                           FIDIV(R)
1986 FLDCW                                  FILD
1987 FLDENV                                 FIMUL
1988 FNOP                                   FIST(P)
1989 FRSTOR                                 FISUB(R)
1990 FSAVE                                  FLD (conversion)
1991 FST(P) (register-to-register)          FMUL(P)
1992 FSTP (extended format to memory)       FPATAN
1993 FSTCW                                  FPREM
1994 FSTENV                                 FPREM1
1995 FSTSW                                  FPTAN
1996 FWAIT                                  FRNDINT
1997 FXAM                                   FSCALE
1998 FXCH                                   FSIN
1999                                        FSINCOS
2000                                        FSQRT
2001                                        FST(P) (conversion)
2002                                        FSUB(R)(P)
2003                                        FTST
2004                                        FUCOM(P)(P)
2005                                        FXTRACT
2006                                        FYL2X
2007                                        FYL2XP1
2008
2009
2010
2011 3.1  Special Numeric Values
2012
2013 The 80387 data formats encompass encodings for a variety of special values
2014 in addition to the typical real or integer data values that result from
2015 normal calculations. These special values have significance and can express
2016 relevant information about the computations or operations that produced
2017 them. The various types of special values are
2018
2019   •  Denormal real numbers
2020   •  Zeros
2021   •  Positive and negative infinity
2022   •  NaN (Not-a-Number)
2023   •  Indefinite
2024   •  Unsupported formats
2025
2026 The following sections explain the origins and significance of each of
2027 these special values. Tables 3-6 through 3-9 at the end of this section
2028 show how each of these special values is encoded for each of the numeric
2029 data types.
2030
2031
2032 3.1.1  Denormal Real Numbers
2033
2034 The 80387 generally stores nonzero real numbers in normalized
2035 floating-point form; that is, the integer (leading) bit of the significand
2036 is always a one. (Refer to Chapter 2 for a review of operand formats.) This
2037 bit is explicitly stored in the extended format, and is implicitly assumed
2038 to be a one (1.) in the single and double formats. Since leading zeros are
2039 eliminated, normalized storage allows the maximum number of significant
2040 digits to be held in a significand of a given width.
2041
2042 When a numeric value becomes very close to zero, normalized floating-point
2043 storage cannot be used to express the value accurately. The term tiny is
2044 used here to precisely define what values require special handling by the
2045 80387. A number R is said to be tiny when -2^(Emin) < R < 0 or
2046 0 < R < +2^(Emin). (As defined in Chapter 2, Emin is -126 for single format,
2047 -1022 for double format, and -16382 for extended format.) In other words, a
2048 nonzero number is tiny if its exponent would be too negative to store in the
2049 destination format.
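
The definition of tiny translates directly into a test against 2^(Emin).
The Python sketch below is illustrative; is_tiny and the EMIN table are
hypothetical names, with the Emin values taken from Chapter 2.

```python
from fractions import Fraction

EMIN = {"single": -126, "double": -1022, "extended": -16382}


def is_tiny(r, fmt):
    """True if r is nonzero and smaller in magnitude than 2**Emin for fmt."""
    r = Fraction(r)
    return r != 0 and abs(r) < Fraction(2) ** EMIN[fmt]


value = Fraction(1, 2 ** 127)           # 2**-127
print(is_tiny(value, "single"), is_tiny(value, "double"))
```

The same value can be tiny for one destination format and perfectly
ordinary for a wider one.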
2050
2051 To accommodate these instances, the 80387 can store and operate on reals
2052 that are not normalized, i.e., whose significands contain one or more
2053 leading zeros. Denormals typically arise when the result of a calculation
2054 yields a value that is tiny.
2055
2056 Denormal values have the following properties:
2057
2058   •  The biased floating-point exponent is stored at its smallest value
2059      (zero)
2060
2061   •  The integer bit of the significand (whether explicit or implicit) is
2062      zero
2063
2064 The leading zeros of denormals permit smaller numbers to be represented, at
2065 the possible cost of some lost precision (the number of significant bits is
2066 reduced by the leading zeros). In typical algorithms, extremely small values
2067 are most likely to be generated as intermediate, rather than final, results.
2068 By using the NPX's extended real format for holding intermediate values,
2069 quantities as small as ±3.4*10^(-4932) can be represented; this makes the
2070 occurrence of denormal numbers a rare phenomenon in 80387 applications.
2071 Nevertheless, the NPX can load, store, and operate on denormalized real
2072 numbers when they do occur.
2073
2074 Denormals receive special treatment by the 80387 in three respects:
2075
2076   •  The 80387 avoids creating denormals whenever possible. In other words,
2077      it always normalizes real numbers except in the case of tiny numbers.
2078
2079   •  The 80387 provides the unmasked underflow exception to permit
2080      programmers to detect cases when denormals would be created.
2081
2082   •  The 80387 provides the denormal exception to permit programmers to
2083      detect cases when denormals enter into further calculations.
2084
2085 Denormalizing means incrementing the true result's exponent and inserting a
2086 corresponding leading zero in the significand, shifting the rest of the
2087 significand one place to the right. Denormal values may occur in any of the
2088 single, double, or extended formats. Table 3-2 illustrates how a result
2089 might be denormalized to fit a single format destination.
2090
2091 Denormalization produces either a denormal or a zero. Denormals are readily
2092 identified by their exponents, which are always the minimum for their
2093 formats; in biased form, this is always the bit string: 00..00. This same
2094 exponent value is also assigned to the zeros, but a denormal has a nonzero
2095 significand. A denormal in a register is tagged special. Tables 3-8 and
2096 3-9 show how denormal values are encoded in each of the real data formats.
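
The identification rule above (minimum exponent, nonzero significand) can be
applied to a raw bit pattern. The Python sketch below assumes the single and
double layouts, in which the integer bit is implicit; it does not cover the
extended format, where that bit is stored explicitly. The function names are
hypothetical.

```python
import struct


def classify_tiny_encoding(bits, exponent_bits=8, fraction_bits=23):
    """Distinguish zero from denormal by the exponent and fraction fields."""
    biased_exponent = (bits >> fraction_bits) & ((1 << exponent_bits) - 1)
    fraction = bits & ((1 << fraction_bits) - 1)
    if biased_exponent != 0:
        return "not zero or denormal"
    return "zero" if fraction == 0 else "denormal"


def single_from_bits(bits):
    """Interpret a 32-bit pattern as a single-format real."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]


for pattern in (0x00000001, 0x80000000, 0x00800000):
    print(hex(pattern), classify_tiny_encoding(pattern), single_from_bits(pattern))
```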
2097
2098 The denormalization process causes loss of significance if low-order
2099 one-bits bits are shifted off the right of the significand. In a severe
2100 case, all the significand bits of the true result are shifted out and
2101 replaced by the leading zeros. In this case, the result of denormalization
2102 is a true zero, and, if the value is in a register, it is tagged as a zero.
2103
2104 Denormals are rarely encountered in most applications. Typical debugged
2105 algorithms generate extremely small results during the evaluation of
2106 intermediate subexpressions; the final result is usually of an appropriate
2107 magnitude for its single or double format real destination. If intermediate
2108 results are held in temporary real, as is recommended, the great range of
2109 this format makes underflow very unlikely. Denormals are likely to arise
2110 only when an application generates a great many intermediates, so many that
2111 they cannot be held on the register stack or in extended format memory
2112 variables. If storage limitations force the use of single or double format
2113 reals for intermediates, and small values are produced, underflow may occur,
2114 and, if masked, may generate denormals.
2115
2116 When a denormal number in single or double format is used as a source
2117 operand and the denormal exception is masked, the 80387 automatically
2118 normalizes the number when it is converted to extended format.
2119
2120
2121 Table 3-2.  Denormalization Process
2122
2123 Operation          Sign    Exponent    Significand
2124
2125 True Result        0       -129        1.01011100..00
2126 Denormalize        0       -128        0.101011100..00
2127 Denormalize        0       -127        0.0101011100..00
2128 Denormalize        0       -126        0.00101011100..00
2129 Denormal Result    0       -126        0.00101011100..00
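
The steps of Table 3-2 can be reproduced with a few lines of Python. This is
a sketch under simplifying assumptions: the significand is held as a text
string of bits, and no bits are dropped on the right, whereas a real
destination format would round away any bits shifted past its last place.
The name denormalize is hypothetical.

```python
def denormalize(exponent, significand, emin=-126):
    """Shift right and bump the exponent until it reaches emin.

    significand is a bit string such as "1.01011100". Returns the list of
    (exponent, significand) pairs, starting with the true result.
    """
    steps = [(exponent, significand)]
    while exponent < emin:
        integer_bit, fraction = significand.split(".")
        significand = "0." + integer_bit + fraction   # insert a leading zero
        exponent += 1
        steps.append((exponent, significand))
    return steps


for exp, sig in denormalize(-129, "1.01011100"):
    print(exp, sig)
```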
2130
2131
2132 3.1.1.1  Denormals and Gradual Underflow
2133
2134 Floating-point arithmetic cannot carry out all operations exactly for all
2135 operands; approximation is unavoidable when the exact result is not
2136 representable as a floating-point variable. To keep the approximation
2137 mathematically tractable, the hardware is made to conform to accuracy
2138 standards that can be modeled by certain inequalities instead of equations.
2139 Let the assignment
2140
2141   X ← Y @ Z    (where @ is some operation)
2142
2143 represent a typical operation. In the default rounding mode (round to
2144 nearest), each operation is carried out with an absolute error no larger
2145 than half the separation between the two floating-point numbers closest to
2146 the exact results. Let x be the value stored for the variable whose name in
2147 the program is X, and similarly y for Y, and z for Z. Normally y and z will
2148 differ by accumulated errors from what is desired and from what would have
2149 been obtained in the absence of error. For the calculation of x we assume
2150 that y and z are the best approximations available, and we seek to compute x
2151 as well as we can. If y@z is representable exactly, then we expect x = y@z,
2152 and that is what we get for every algebraic operation on the 80387 (i.e.,
2153 when y@z is one of y+z, y-z, y*z, y÷z, sqrt z). But if y@z must be
2154 approximated, as is usually the case, then x must differ from y@z by no
2155 more than half the difference between the two representable numbers that
2156 straddle y@z. That difference depends on two factors:
2157
2158   1.  The precision to which the calculation is carried out, as determined
2159       either by the precision control bits or by the format used in memory.
2160       On the 80387, the precisions are single (24 significant bits), double
2161       (53 significant bits), and extended (64 significant bits).
2162
2163   2.  How close y@z is to zero. In this respect the presence of denormal
2164       numbers on the 80387 provides a distinct advantage over systems that
2165       do not admit denormal numbers.
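
The half-gap bound can be checked numerically. The Python sketch below runs
on the host's double-format arithmetic, which is assumed to round to
nearest, rather than on an 80387; within_half_ulp is a hypothetical name,
and math.ulp requires Python 3.9 or later. It compares the stored product
with the exact rational product of the stored operands.

```python
import math
from fractions import Fraction


def within_half_ulp(y, z):
    """Check that the stored y*z is within half a gap of the exact product."""
    x = y * z                               # rounded by the host hardware
    exact = Fraction(y) * Fraction(z)       # exact product of stored operands
    error = abs(Fraction(x) - exact)
    return error <= Fraction(math.ulp(x)) / 2


print(within_half_ulp(0.1, 0.3), within_half_ulp(1.0 / 3.0, 3.0))
```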
2166
2167 In any floating-point number system, the density of representable numbers
2168 is greater near zero than near the largest representable magnitudes.
2169 However, machines that do not use denormal numbers suffer from an enormous
2170 gap between zero and its closest neighbors. Figures 3-1 and 3-2 show what
2171 happens near zero in two kinds of floating-point number systems.
2172
2173 Figure 3-1 shows a floating-point number system that (like the 80387)
2174 admits denormal numbers. For simplicity, only the non-negative numbers
2175 appear and the figure illustrates a number system that carries just four
2176 significant bits instead of the 24, 53, or 64 significant bits that the
2177 80387 offers.
2178
2179 Each vertical mark stands for a number representable in four significant
2180 bits, and the bolder marks stand for the normal powers of 2. The denormal
2181 numbers lie between 0 and the nearest normal power of 2. They are no less
2182 dense than the remaining normal nonzero numbers.
2183
2184 Figure 3-2 shows a floating-point number system that (unlike the 80387)
2185 does not admit denormal numbers. There are two yawning gaps, one on the
2186 positive side of zero (as illustrated) and one on the negative side of zero
2187 (not illustrated). The gap between zero and the nearest neighbor of zero
2188 differs from the gap between that neighbor and the next bigger number by a
2189 factor of about 8.4 * 10^(6) for single, 4.5 * 10^(15) for double, and
2190 9.2*10^(18) for extended format. Those gaps would horribly complicate error
2191 analysis.
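
The factors quoted above are 2^(p-1) for a significand of p bits. A short
Python check, with the hypothetical helper gap_ratio and the p and Emin
values from Table 2-3:

```python
from fractions import Fraction


def gap_ratio(p, emin):
    """Ratio of the gap below the smallest normal number to the gap above it,
    in a system without denormals."""
    smallest_normal = Fraction(2) ** emin
    gap_to_zero = smallest_normal                    # nothing lies in between
    gap_to_next = Fraction(2) ** (emin - (p - 1))    # spacing in lowest binade
    return gap_to_zero / gap_to_next


for name, p, emin in (("single", 24, -126), ("double", 53, -1022),
                      ("extended", 64, -16382)):
    print(name, f"{float(gap_ratio(p, emin)):.1e}")
```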
2192
2193 The advantage of denormal numbers is apparent when one considers what
2194 happens in either case when the underflow exception is masked and y@z falls
2195 into the space between zero and the smallest normal magnitude. The 80387
2196 returns the nearest denormal number. This action might be called "gradual
2197 underflow." The effect is no different than the rounding that can occur when
2198 y@z falls in the normal range.
2199
2200 On the other hand, the system that does not have denormal numbers returns
2201 zero as the result, an action that can be much more inaccurate than
2202 rounding. This action could be called "abrupt underflow."
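
The difference between the two behaviors is easy to see in double format.
The Python sketch below assumes the host supports denormals, as mainstream
platforms do by default; flush_to_zero is a hypothetical function that
imitates a system without them.

```python
import sys

SMALLEST_NORMAL = sys.float_info.min        # 2**-1022 in double format


def flush_to_zero(x):
    """Imitate abrupt underflow: anything below the normal range becomes 0."""
    return x if abs(x) >= SMALLEST_NORMAL else 0.0


y = 1.5 * SMALLEST_NORMAL
z = SMALLEST_NORMAL
gradual = y - z                 # a denormal: exactly 2**-1023
abrupt = flush_to_zero(y - z)   # zero, although y and z differ
print(gradual, abrupt)
```

With gradual underflow, y - z is zero only when y equals z; with abrupt
underflow that guarantee is lost.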
2203
2204
2205 Figure 3-1.  Floating-Point System with Denormals
2206
2207  0+++++++|+++++++|-+-+-+-+-+-+-+-|---+---+---+---+---+---+---+---|------+...
2208
2209   +--+--+ - - - - - - - - Normal Numbers - - - - - ->
2210   Denormals
2211
2212
2213 Figure 3-2.  Floating-Point System without Denormals
2214
2215  0    |+++++++|-+-+-+-+-+-+-+-|---+---+---+---+---+---+---+---|------+---...
2216
2217        - - - - - - - - Normal Numbers - - - - - ->
2218
2219
2220 3.1.2  Zeros
2221
2222 The value zero in the real and decimal integer formats may be signed either
2223 positive or negative, although the sign of a binary integer zero is always
2224 positive. For computational purposes, the value of zero always behaves
2225 identically, regardless of sign, and typically the fact that a zero may be
2226 signed is transparent to the programmer. If necessary, the FXAM instruction
2227 may be used to determine a zero's sign.
2228
2229 If a zero is loaded or generated in a register, the register is tagged
2230 zero. Table 3-3 lists the results of instructions executed with zero
2231 operands and also shows how a zero may be created from nonzero operands.
2232
2233
2234 Table 3-3.  Zero Operands and Results
2235
2236
2237 Key to symbols used in this table
2238 X and Y denote nonzero operands.
2239 *  Sign of original zero operand.
2240 #  Sign of original X operand.
2241 -# Complement of sign of original X operand.
2242 ⊕  Exclusive OR of the signs of the operands.
2243
2244
2245 Operation          Operands                    Result
2246
2247 FLD,FBLD           +0                          +0
2248                    -0                          -0
2249 FILD               +0                          +0
2250 FST,FSTP           +0                          +0
2251                    -0                          -0
2252                    +X                          +0 (when extreme
2253                                                underflow denormalizes
2254                                                the result to zero)
2256                    -X                          -0 (when extreme
2257                                                underflow denormalizes
2258                                                the result to zero)
2260 FBSTP              +0                          +0
2261                    -0                          -0
2262 FIST,FISTP         +0                          +0
2263                    -0                          -0
2264                    +X                          +0 (when 0 < X < 1 and
2265                                                rounding mode is not up)
2268                    -X                          -0 (when 0 < X < 1 and
2269                                                rounding mode is not down)
2272 Addition           +0 plus +0                  +0
2273                    -0 plus -0                  -0
2274                    +0 plus -0, -0 plus +0      ±0 (sign set by rounding
2275                                                mode: + for nearest, up,
2276                                                or chop; - for down)
2278                    -X plus +X, +X plus -X      ±0 (sign set by rounding
2279                                                mode: + for nearest, up,
2280                                                or chop; - for down)
2282                    ±0 plus ±X, ±X plus ±0      #X
2283 Subtraction        +0 minus -0                 +0
2284                    -0 minus +0                 -0
2285                    +0 minus +0, -0 minus -0    ±0 (sign set by rounding
2286                                                mode: + for nearest, up,
2287                                                or chop; - for down)
2289                    +X minus +X, -X minus -X    ±0 (sign set by rounding
2290                                                mode: + for nearest, up,
2291                                                or chop; - for down)
2293                    ±0 minus ±X                 -#X
2294                    ±X minus ±0                 #X
2295 Multiplication     +0 * +0, -0 * -0            +0
2296                    +0 * -0, -0 * +0            -0
2297                    +0 * +X, +X * +0            +0
2298                    +0 * -X, -X * +0            -0
2299                    -0 * +X, +X * -0            -0
2300                    -0 * -X, -X * -0            +0
2301                    +X * +Y, -X * -Y            +0 (when extreme
2302                                                underflow denormalizes
2303                                                the result to zero)
2305                    +X * -Y, -X * +Y            -0 (when extreme
2306                                                underflow denormalizes
2307                                                the result to zero)
2309 Division           ±0 ÷ ±0                     Invalid Operation
2310                    ±X ÷ ±0                     ⊕∞ (Zero Divide)
2311                    +0 ÷ +X, -0 ÷ -X            +0
2312                    +0 ÷ -X, -0 ÷ +X            -0
2313                    -X ÷ -Y, +X ÷ +Y            +0 (when extreme
2314                                                underflow denormalizes
2315                                                the result to zero)
2317                    -X ÷ +Y, +X ÷ -Y            -0 (when extreme
2318                                                underflow denormalizes
2319                                                the result to zero)
2321 FPREM, FPREM1      ±0 rem ±0                   Invalid Operation
2322                    ±X rem ±0                   Invalid Operation
2323                    +0 rem ±X                   +0
2324                    -0 rem ±X                   -0
2325 FPREM              +X rem ±Y                   +0 if Y exactly divides X
2326                    -X rem ±Y                   -0 if Y exactly divides X
2327 FPREM1             +X rem ±Y                   +0 if Y exactly divides X
2328                    -X rem ±Y                   -0 if Y exactly divides X
2329 FSQRT              +0                          +0
2330                    -0                          -0
2331 Compare            ±0 : +X                     ±0 < +X
2332                    ±0 : ±0                     ±0 = ±0
2333                    ±0 : -X                     ±0 > -X
2334 FTST               ±0                          ±0 = 0
2335 FXAM               +0                          C3=1; C2=C1=C0=0
2336                    -0                          C3=C1=1; C2=C0=0
2337 FCHS               +0                          -0
2338                    -0                          +0
2339 FABS               ±0                          +0
2340 F2XM1              +0                          +0
2341                    -0                          -0
2342 FRNDINT            +0                          +0
2343                    -0                          -0
2344 FSCALE             ±0 scaled by -∞             *0
2345                    ±0 scaled by +∞             Invalid Operation
2346                    ±0 scaled by X              *0
2347 FXTRACT            +0                          ST=+0,ST(1)=-∞, Zero divide
2348                    -0                          ST=-0,ST(1)=-∞, Zero divide
2349 FPTAN              ±0                          *0
2350 FSIN (or           ±0                          *0
2351 SIN result of
2352 FSINCOS)
2353 FCOS (or           ±0                          +1
2354 COS result of
2355 FSINCOS)
2356 FPATAN             ±0 ÷ +X                     *0
2357                    ±0 ÷ -X                     *π
2358                    ±X ÷ ±0                     #π/2
2359                    ±0 ÷ +0                     *0
2360                    ±0 ÷ -0                     *π
2361                    +∞ ÷ ±0                     +π/2
2362                    -∞ ÷ ±0                     -π/2
2363                    ±0 ÷ +∞                     *0
2364                    ±0 ÷ -∞                     *π
2365 FYL2X              ±Y * log(±0)                Zero Divide
2366                    ±0 * log(±0)                Invalid Operation
2367 FYL2XP1            +Y * log(±0+1)              *0
2368                    -Y * log(±0+1)              -0
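
Several rows of Table 3-3 can be observed on any host whose floating-point
arithmetic follows the same conventions. The Python sketch below runs on the
host's double-format arithmetic in its default round-to-nearest mode, not on
an 80387, and the helper sign_of is a hypothetical name. Note that Python's
atan2(y, x) takes the numerator first, matching the "y ÷ x" notation used
for FPATAN in the table.

```python
import math


def sign_of(x):
    """Return '+' or '-' for any float, including signed zeros."""
    return "-" if math.copysign(1.0, x) < 0 else "+"


rows = {
    "+0 plus -0": 0.0 + (-0.0),
    "-0 plus -0": (-0.0) + (-0.0),
    "+X minus +X": 2.5 - 2.5,
    "+0 * -X": 0.0 * -3.0,
    "-0 / -X": -0.0 / -3.0,
    "sqrt(-0)": math.sqrt(-0.0),
    "atan2(-0, -X)": math.atan2(-0.0, -3.0),
}
for name, value in rows.items():
    print(f"{name:14} {sign_of(value)}{abs(value)}")
```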
2369
2370
2371 3.1.3  Infinity
2372
2373 The real formats support signed representations of infinities. These values
2374 are encoded with a biased exponent of all ones and a significand of
2375 1.00..00; if the infinity is in a register, it is tagged special.
2376
2377 A programmer may code an infinity, or it may be created by the NPX as its
2378 masked response to an overflow or a zero divide exception. Note that
2379 depending on rounding mode, the masked response may create the largest valid
2380 value representable in the destination rather than infinity.
2381
2382 The signs of the infinities are observed, and comparisons are possible.
2383 Infinities are always interpreted in the affine sense; that is, -ý < (any
2384 finite number) < +ý. Arithmetic on infinities is always exact and,
2385 therefore, signals no exceptions, except for the invalid operations
2386 specified in Table 3-4.
2387
2388
2389 Table 3-4.  Infinity Operands and Results
2390
2391
2392 Key to symbols used in this table
2393 X  Zero or nonzero positive operand.
2394 Y  Nonzero positive operand.
2395 *  Sign of original infinity operand.
2396 -* Complement of sign of original infinity operand.
2397 $  Sign of original operand.
2398 #  Sign of the original Y operand.
2399 ⊕  Exclusive OR of signs of operands.
2400
2401
2402 Operation           Operands            Result
2403
2404 Addition            +∞ plus +∞          +∞
2405                     -∞ plus -∞          -∞
2406                     +∞ plus -∞          Invalid Operation
2407                     -∞ plus +∞          Invalid Operation
2408                     ±∞ plus ±X          *∞
2409                     ±X plus ±∞          *∞
2410 Subtraction         +∞ minus -∞         +∞
2411                     -∞ minus +∞         -∞
2412                     +∞ minus +∞         Invalid Operation
2413                     -∞ minus -∞         Invalid Operation
2414                     ±∞ minus ±X         *∞
2415                     ±X minus ±∞         -*∞
2416 Multiplication      ±∞ * ±∞             ⊕∞
2417                     ±∞ * ±Y, ±Y * ±∞    ⊕∞
2418                     ±0 * ±∞, ±∞ * ±0    Invalid Operation
2419 Division            ±∞ ÷ ±∞             Invalid Operation
2420                     ±∞ ÷ ±X             ⊕∞
2421                     ±X ÷ ±∞             ⊕0
2422                     ±∞ ÷ ±0             ⊕∞
2423 FSQRT               -∞                  Invalid Operation
2424                     +∞                  +∞
2425 FPREM, FPREM1       ±∞ rem ±∞           Invalid Operation
2426                     ±∞ rem ±X           Invalid Operation
2427                     ±X rem ±∞           $X, Q = 0
2428 FRNDINT             ±∞                  *∞
2429 FSCALE              ±∞ scaled by -∞     Invalid Operation
2430                     ±∞ scaled by +∞     *∞
2431                     ±∞ scaled by ±X     *∞
2432                     ±0 scaled by -∞     ±0 (sign of original zero operand)
2436                     ±0 scaled by +∞     Invalid Operation
2437                     ±Y scaled by +∞     #∞
2438                     ±Y scaled by -∞     #0
2439 FXTRACT             ±∞                  ST = *∞, ST(1) = +∞
2440 Compare             +∞ : +∞             +∞ = +∞
2441                     -∞ : -∞             -∞ = -∞
2442                     +∞ : -∞             +∞ > -∞
2443                     -∞ : +∞             -∞ < +∞
2444                     +∞ : ±X             +∞ > X
2445                     -∞ : ±X             -∞ < X
2446                     ±X : +∞             X < +∞
2447                     ±X : -∞             X > -∞
2448 FTST                +∞                  +∞ > 0
2449                     -∞                  -∞ < 0
2450 FPATAN              ±∞ ÷ ±X             *π/2
2451                     ±Y ÷ +∞             #0
2452                     ±Y ÷ -∞             #π
2453                     ±∞ ÷ +∞             *π/4
2454                     ±∞ ÷ -∞             *3π/4
2455                     ±∞ ÷ ±0             *π/2
2456                     +0 ÷ +∞             +0
2457                     +0 ÷ -∞             +π
2458                     -0 ÷ +∞             -0
2459                     -0 ÷ -∞             -π
2460 F2XM1               +∞                  +∞
2461                     -∞                  -1
2462 FYL2X, FYL2XP1      ±∞ * log(1)         Invalid Operation
2463                     ±∞ * log(Y>1)       *∞
2464                     ±∞ * log(0<Y<1)     -*∞
2465                     ±Y * log(+∞)        #∞
2466                     ±0 * log(+∞)        Invalid Operation
2467                     ±Y * log(-∞)        Invalid Operation
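
A few rows of Table 3-4 can be reproduced with the host's double-format
arithmetic. In the Python sketch below, an invalid operation shows up either
as a NaN result (comparable to the masked response) or as a ValueError
raised by the math module; this is Python's behavior and is only an analogy
to the 80387's exception mechanism.

```python
import math

inf = math.inf

results = {
    "+inf plus +inf": inf + inf,
    "+inf minus +inf": inf - inf,        # invalid: yields NaN
    "0 * inf": 0.0 * inf,                # invalid: yields NaN
    "+X / +inf": 5.0 / inf,
    "-X / +inf": -5.0 / inf,
    "sqrt(+inf)": math.sqrt(inf),
    "atan2(+inf, +inf)": math.atan2(inf, inf),
    "atan2(+inf, -inf)": math.atan2(inf, -inf),
}
for name, value in results.items():
    print(f"{name:18} {value}")

try:
    math.sqrt(-inf)                      # invalid: raises in Python
    sqrt_neg_inf_raised = False
except ValueError:
    sqrt_neg_inf_raised = True
print("sqrt(-inf) raised:", sqrt_neg_inf_raised)
```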
2468
2469
2470 3.1.4  NaN (Not-a-Number)
2471
2472 A NaN (Not a Number) is a member of a class of special values that exists
2473 in the real formats only. A NaN has an exponent of 11..11B, may have either
2474 sign, and may have any significand except 1.00..00B, which is assigned to
2475 the infinities. A NaN in a register is tagged special.
2476
2477 There are two classes of NaNs: signaling (SNaN) and quiet (QNaN). Among the
2478 QNaNs, the value real indefinite is of special interest.
2479
2480
2481 3.1.4.1  Signaling NaNs
2482
2483 A signaling NaN is a NaN that has a zero as the most significant bit of its
2484 significand. The rest of the significand may be set to any value. The 80387
2485 never generates a signaling NaN as a result; however, it recognizes
2486 signaling NaNs when they appear as operands. Arithmetic operations (as
2487 defined at the beginning of this chapter) on a signaling NaN cause an
2488 invalid-operation exception (except for load operations, FXCH, FCHS, and
2489 FABS).
2490
2491 By unmasking the invalid operation exception, the programmer can use
2492 signaling NaNs to trap to the exception handler. The generality of this
2493 approach and the large number of NaN values that are available provide the
2494 sophisticated programmer with a tool that can be applied to a variety of
2495 special situations.
2496
2497 For example, a compiler could use signaling NaNs as references to
2498 uninitialized (real) array elements. The compiler could preinitialize each
2499 array element with a signaling NaN whose significand contained the index
2500 (relative position) of the element. If an application program attempted to
2501 access an element that it had not initialized, it would use the NaN placed
2502 there by the compiler. If the invalid operation exception were unmasked, an
2503 interrupt would occur, and the exception handler would be invoked. The
2504 exception handler could determine which element had been accessed, since the
2505 operand address field of the exception pointers would point to the NaN, and
2506 the NaN would contain the index number of the array element.
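
The array-initialization idea can be sketched at the bit level. The Python
example below assumes the double format layout (1 sign bit, 11 exponent
bits, 52 fraction bits) and works on integer bit patterns only, because
passing a signaling NaN through ordinary floating-point operations may turn
it into a quiet NaN. All names are hypothetical. The sketch stores index + 1
rather than the index itself, since an all-zero fraction would encode
infinity instead of a NaN.

```python
EXPONENT_ALL_ONES = 0x7FF << 52
FRACTION_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51             # most significant fraction bit


def make_snan_bits(index):
    """Bit pattern of a signaling NaN that records an array index."""
    payload = index + 1         # keep the fraction nonzero
    if not 0 < payload < QUIET_BIT:
        raise ValueError("index does not fit below the quiet bit")
    return EXPONENT_ALL_ONES | payload


def is_snan_bits(bits):
    """True if bits encode a NaN whose most significant fraction bit is 0."""
    is_nan = (bits & EXPONENT_ALL_ONES) == EXPONENT_ALL_ONES and \
             (bits & FRACTION_MASK) != 0
    return is_nan and not (bits & QUIET_BIT)


def index_from_nan_bits(bits):
    """Recover the array index stored by make_snan_bits."""
    return (bits & (QUIET_BIT - 1)) - 1


marker = make_snan_bits(41)
print(hex(marker), is_snan_bits(marker), index_from_nan_bits(marker))
```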
2507
2508
2509 3.1.4.2  Quiet NaNs
2510
2511 A quiet NaN is a NaN that has a one as the most significant bit of its
2512 significand. The 80387 creates the quiet NaN real indefinite (defined below)
2513 as its default response to certain exceptional conditions. The 80387 may
2514 derive other QNaNs by converting an SNaN. The 80387 converts an SNaN by
2515 setting the most significant bit of its significand to one, thereby
2516 generating a QNaN. The remaining bits of the significand are not changed;
2517 therefore, diagnostic information that may be stored in these bits of the
2518 SNaN is propagated into the QNaN.
2519
2520 The 80387 will generate the special QNaN, real indefinite, as its masked
2521 response to an invalid operation exception. This NaN is signed negative; its
2522 significand is encoded 1.100..00. All other NaNs represent values created
2523 by programmers or derived from values created by programmers.
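
The SNaN-to-QNaN conversion and the real indefinite encoding can both be
written out as bit patterns. The Python sketch below assumes the double
format layout, where the stored fraction 100..00 corresponds to the
significand 1.100..00 described above; the names are hypothetical.

```python
import math
import struct

FRACTION_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51                         # most significant fraction bit
REAL_INDEFINITE = 0xFFF8000000000000        # sign 1, exponent 11..11, 100..00


def quiet(bits):
    """Convert an SNaN pattern to a QNaN; the other fraction bits are kept."""
    return bits | QUIET_BIT


def to_float(bits):
    """Interpret a 64-bit pattern as a double-format real."""
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]


snan = 0x7FF0000000000007                   # SNaN carrying the value 7
print(hex(quiet(snan)), math.isnan(to_float(REAL_INDEFINITE)))
```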
2524
2525 Both quiet and signaling NaNs are supported in all operations. A QNaN is
2526 generated as the masked response for invalid-operation exceptions and as the
2527 result of an operation in which at least one of the operands is a QNaN. The
2528 80387 applies the rules shown in Table 3-5 when generating a QNaN:
2529
2530 Note that handling of a QNaN operand has greater priority than all
2531 exceptions except certain invalid-operation exceptions (refer to the section
2532 "Exception Priority" in this chapter).
2533
2534 Quiet NaNs could be used, for example, to speed up debugging. In its early
2535 testing phase, a program often contains multiple errors. An exception
2536 handler could be written to save diagnostic information in memory whenever
2537 it was invoked. After storing the diagnostic data, it could supply a quiet
2538 NaN as the result of the erroneous instruction, and that NaN could point to
2539 its associated diagnostic area in memory. The program would then continue,
2540 creating a different NaN for each error. When the program ended, the NaN
2541 results could be used to access the diagnostic data saved at the time the
2542 errors occurred. Many errors could thus be diagnosed and corrected in one
2543 test run.
2544
2545
2546 Table 3-5.  Rules for Generating QNaNs
2547
2548 Operation                          Action
2549
2550 Real operation on an SNaN and      Deliver the QNaN operand.
2551 a QNaN
2552
2553 Real operation on two SNaNs        Deliver the QNaN that results from
2554                                    converting the SNaN that has the larger
2555                                    significand.
2556
2557 Real operation on two QNaNs        Deliver the QNaN that has the larger
2558                                    significand.
2559
2560 Real operation on an SNaN and      Deliver the QNaN that results from
2561 another number                     converting the SNaN.
2562
2563 Real operation on a QNaN and       Deliver the QNaN.
2564 another number
2565
2566 Invalid operation that does not    Deliver the default QNaN real indefinite.
2567 involve NaNs
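
The rules of Table 3-5 can be expressed as a small selection function. The
following Python sketch is illustrative only: it models operands as 64-bit
double-real bit patterns held as integers, the function names are
hypothetical, and it assumes the caller has already decided that the operation
is invalid when neither operand is a NaN. When two NaNs have equal
significands the sketch returns the first operand, which is an arbitrary
choice not taken from the table.

```python
EXP_MASK = 0x7FF << 52
FRAC_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51
REAL_INDEFINITE = 0xFFF8000000000000    # negative QNaN, fraction 10...00


def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def is_snan(bits):
    return is_nan(bits) and not (bits & QUIET_BIT)


def quiet(bits):
    """Convert an SNaN to a QNaN; a QNaN is returned unchanged."""
    return bits | QUIET_BIT


def qnan_result(a, b):
    """Pick the QNaN delivered for a real operation on patterns a and b."""
    nans = [x for x in (a, b) if is_nan(x)]
    if not nans:
        return REAL_INDEFINITE           # invalid operation, no NaN operands
    if len(nans) == 1:
        return quiet(nans[0])            # NaN and another number
    quiets = [x for x in nans if not is_snan(x)]
    if len(quiets) == 1:
        return quiets[0]                 # SNaN and QNaN: deliver the QNaN
    # Two SNaNs or two QNaNs: the larger significand wins.
    return quiet(max(nans, key=lambda x: x & FRAC_MASK))
```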
2568
2569
2570 3.1.5  Indefinite
2571
2572 For every 80387 numeric data type, one unique encoding is reserved for
2573 representing the special value indefinite. The 80387 produces this encoding
2574 as its response to a masked invalid-operation exception.
2575
2576 In the case of reals, the indefinite value is a QNaN as discussed in the
2577 prior section.
2578
Packed decimal indefinite may be stored by the NPX in an FBSTP instruction;
attempting to use this encoding in an FBLD instruction, however, will have an
undefined result; thus indefinite cannot be loaded from a packed decimal
integer.
2583
2584 In the binary integers, the same encoding may represent either indefinite
2585 or the largest negative number supported by the format (-2^(15), -2^(31), or
2586 -2^(63)). The 80387 will store this encoding as its masked response to
2587 an invalid operation, or when the value in a source register represents or
2588 rounds to the largest negative integer representable by the destination. In
2589 situations where its origin may be ambiguous, the invalid-operation
2590 exception flag can be examined to see if the value was produced by an
2591 exception response. When this encoding is loaded or used by an integer
2592 arithmetic or compare operation, it is always interpreted as a negative
2593 number; thus indefinite cannot be loaded from a binary integer.
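
As a worked illustration of the shared encoding, the following Python sketch
builds the integer indefinite pattern for each binary integer width and shows
that reading it back as a two's complement integer gives the largest negative
number. The function names are hypothetical and the sketch does not touch the
NPX itself.

```python
def integer_indefinite(width):
    """Bit pattern stored as integer indefinite for a 16-, 32- or 64-bit format."""
    return 1 << (width - 1)              # sign bit set, magnitude bits all zero


def as_signed(pattern, width):
    """Interpret a stored pattern as a two's complement integer."""
    if pattern >> (width - 1):
        return pattern - (1 << width)
    return pattern


for width in (16, 32, 64):
    pattern = integer_indefinite(width)
    print(width, hex(pattern), as_signed(pattern, width))
```

A program that needs to tell the two meanings apart would combine this pattern
test with a check of the invalid-operation flag, as described above.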
2594
2595
2596 3.1.6  Encoding of Data Types
2597
2598 Tables 3-6 through 3-9 show how each of the special values just
2599 described is encoded for each of the numeric data types. In these tables,
2600 the least-significant bits are shown to the right and are stored in the
2601 lowest memory addresses. The sign bit is always the left-most bit of the
2602 highest-addressed byte.
2603
2604
2605 3.1.7  Unsupported Formats
2606
2607 The extended format permits many bit patterns that do not fall into any of
2608 the previously mentioned categories. Some of these encodings were supported
2609 by the 80287 NPX; however, most of them are not supported by the 80387 NPX.
2610 These changes are required due to changes made in the final version of the
2611 IEEE 754 standard that eliminated these data types.
2612
2613 The categories of encodings formerly known as pseudozeros, pseudo-NaNs,
2614 pseudoinfinities, and unnormal numbers are not supported by the 80387. The
2615 80387 raises the invalid-operation exception when they are encountered as
2616 operands.
2617
2618 The encodings formerly known as pseudodenormal numbers are not generated by
2619 the 80387; however, they are correctly utilized when encountered in operands
2620 to 80387 instructions. The exponent is treated as if it were 00..01 and the
2621 mantissa is unchanged. The denormal exception is raised.
2622
2623
2624 Table 3-6. Binary Integer Encodings
2625
              Class                   Sign         Magnitude
    +-------------------------------------------------------------
    |         (Largest)                0            11...11
    |                                  .               .
Positives                              .               .
    |                                  .               .
    |         (Smallest)               0            00...01
    +-------------------------------------------------------------
               Zero                    0            00...00
    +-------------------------------------------------------------
    |         (Smallest)               1            11...11
    |                                  .               .
Negatives                              .               .
    |                                  .               .
    |         (Largest/Indefinite*)    1            00...00
    +-------------------------------------------------------------
                                    Word:        <- 15 bits ->
                                    Short:       <- 31 bits ->
                                    Long:        <- 63 bits ->

* If this encoding is used as a source operand (as in an integer load or
  integer arithmetic instruction), the 80387 interprets it as the largest
  negative number representable in the format: -2^(15), -2^(31), or -2^(63).
  The 80387 will deliver this encoding to an integer destination in two
  cases:
    1.  If the result is the largest negative number
    2.  As the response to a masked invalid operation exception, in which
        case it represents the special value integer indefinite.
2653
2654
2655 Table 3-7. Packed Decimal Encodings
2656
2657
2658
                                            |<-------------------- Magnitude -------------------->|
            Class          Sign             digit      digit      digit      digit    . . . digit
    +----------------------------------------------------------------------------------------------
    |       (Largest)      0       0000000  1 0 0 1    1 0 0 1    1 0 0 1    1 0 0 1  . . . 1 0 0 1
    |                      .          .                              .
    |                      .          .                              .
Positives                  .          .                              .
    |       (Smallest)     0       0000000  0 0 0 0    0 0 0 0    0 0 0 0    0 0 0 0  . . . 0 0 0 1
    |
    |       Zero           0       0000000  0 0 0 0    0 0 0 0    0 0 0 0    0 0 0 0  . . . 0 0 0 0
    +----------------------------------------------------------------------------------------------
    +----------------------------------------------------------------------------------------------
    |       Zero           1       0000000  0 0 0 0    0 0 0 0    0 0 0 0    0 0 0 0  . . . 0 0 0 0
    |
    |       (Smallest)     1       0000000  0 0 0 0    0 0 0 0    0 0 0 0    0 0 0 0  . . . 0 0 0 1
Negatives                  .          .                              .
    |                      .          .                              .
    |                      .          .                              .
    |       (Largest)      1       0000000  1 0 0 1    1 0 0 1    1 0 0 1    1 0 0 1  . . . 1 0 0 1
    +----------------------------------------------------------------------------------------------
       Indefinite*         1       1111111  1 1 1 1    1 1 1 1    U U U U    U U U U  . . . U U U U
                           |<- 1 byte -->|  |<--------------------- 9 bytes --------------------->|

* The packed decimal indefinite encoding is stored by FBSTP in response to a
  masked invalid operation exception. Attempting to load this value via FBLD
  produces an undefined result.

UUUU means bit values are undefined and may contain any value.
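
The layout of Table 3-7 can be made concrete with a short encoder. The Python
sketch below is an illustration under stated assumptions, not a description of
how FBSTP is implemented: the function names are hypothetical, the byte at
index 0 stands for the lowest memory address, and the indefinite test simply
checks the two bytes that the table shows as all ones.

```python
def to_packed_decimal(value):
    """Encode an integer of up to 18 decimal digits as 10 bytes, lowest address first."""
    magnitude = abs(value)
    if magnitude >= 10**18:
        raise ValueError("more than 18 decimal digits")
    out = bytearray(10)
    for i in range(9):                     # 9 bytes, two BCD digits each
        magnitude, low = divmod(magnitude, 10)
        magnitude, high = divmod(magnitude, 10)
        out[i] = (high << 4) | low
    out[9] = 0x80 if value < 0 else 0x00   # sign bit plus seven zero bits
    return bytes(out)


def is_packed_indefinite(data):
    """True if the sign byte and the top digit byte are all ones."""
    return data[9] == 0xFF and data[8] == 0xFF


def from_packed_decimal(data):
    """Decode 10 packed decimal bytes; the indefinite encoding is rejected."""
    if is_packed_indefinite(data):
        raise ValueError("packed decimal indefinite cannot be loaded")
    digits = 0
    for byte in reversed(data[:9]):
        digits = digits * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return -digits if data[9] & 0x80 else digits


print(to_packed_decimal(-1234).hex())
```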
2685
2686
2687 Table 3-8. Single and Double Real Encodings
2688
2689
                                          Biased      Significand
            Class                Sign     Exponent    ff--ff*
   +------+--------------------------------------------------------
   |      |           Quiet       0       11...11     11...11
   |      |                                  .           .
   |      |                                  .           .
   |      |                                  .           .
   |      |                       0       11...11     10...00
   |    NaNs        ----------------------------------------------
   |      |           Signaling   0       11...11     01...11
   |      |                                  .           .
   |      |                                  .           .
   |      |                                  .           .
   |      |                       0       11...11     00...01
   |      +--------------------------------------------------------
   |                  Infinity    0       11...11     00...00
   |      +--------------------------------------------------------
   |      |           Normals     0       11...10     11...11
   |      |                                  .           .
Positives |                                  .           .
   |      |                                  .           .
   |      |                       0       00...01     00...00
   |      |         ----------------------------------------------
   |    Reals         Denormals   0       00...00     11...11
   |      |                                  .           .
   |      |                                  .           .
   |      |                                  .           .
   |      |                       0       00...00     00...01
   |      |         ----------------------------------------------
   |      |           Zero        0       00...00     00...00
   +---------------------------------------------------------------
   +---------------------------------------------------------------
   |      |           Zero        1       00...00     00...00
   |      |         ----------------------------------------------
   |      |           Denormals   1       00...00     00...01
   |      |                                  .           .
   |      |                                  .           .
   |    Reals                                .           .
   |      |                       1       00...00     11...11
   |      |         ----------------------------------------------
   |      |           Normals     1       00...01     00...00
   |      |                                  .           .
   |      |                                  .           .
   |      |                                  .           .
   |      |                       1       11...10     11...11
Negatives +--------------------------------------------------------
   |                  Infinity    1       11...11     00...00
   |      +---------+----------------------------------------------
   |      |         |             1       11...11     00...01
   |      |         |                        .           .
   |      |    Signaling                     .           .
   |      |         |                        .           .
   |      |         |             1       11...11     01...11
   |    NaNs        +----------------------------------------------
   |      |         | Indefinite  1       11...11     10...00
   |      |         |    -----------------------------------------
   |      |         |                        .           .
   |      |      Quiet                       .           .
   |      |         |                        .           .
   |      |         |             1       11...11     11...11
   +------+---------+----------------------------------------------
                             Single:  | <- 8 bits -> | <- 23 bits -> |
                             Double:  | <- 11 bits ->| <- 52 bits -> |

* Integer bit is implied and not stored.
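
Table 3-8 can be read as a decision procedure on the exponent and fraction
fields. The following Python sketch applies it to 32-bit single-real bit
patterns held as integers. It is an illustration only, and the function name
and the class labels it returns are hypothetical.

```python
def classify_single(bits):
    """Name the Table 3-8 class of a 32-bit single-real bit pattern."""
    sign = "-" if bits >> 31 else "+"
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        if fraction == 0:
            kind = "infinity"
        elif bits == 0xFFC00000:
            kind = "QNaN (real indefinite)"
        elif fraction & 0x400000:        # most significant fraction bit
            kind = "QNaN"
        else:
            kind = "SNaN"
    elif exponent == 0:
        kind = "zero" if fraction == 0 else "denormal"
    else:
        kind = "normal"
    return sign + kind


for pattern in (0x3F800000, 0x00000001, 0xFF800000, 0x7F800001, 0xFFC00000):
    print(hex(pattern), classify_single(pattern))
```

The double-real format follows the same procedure with an 11-bit exponent and
a 52-bit fraction.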
2758
2759
2760 Table 3-9. Extended Real Encodings
2761
2762
                                        Biased     Significand
            Class              Sign     Exponent   1.ff--ff
    +------+---------------------------------------------------
    |      |                    0       11...11    1 11..11
    |      |     Quiet          .          .          .
    |      |                    .          .          .
    |      |                    0       11...11    1 10..00
    |    NaNs   -----------------------------------------------
    |      |                    0       11...11    1 01..11
    |      |     Signaling      .          .          .
    |      |                    .          .          .
    |      |                    0       11...11    1 00..01
    |      +---------------------------------------------------
    |        Infinity           0       11...11    1 00..00
    |      +---------------------------------------------------
    |      |                    0       11...10    1 11..11
    |      |     Normals        .          .          .
    |      |                    .          .          .
    |      |                    0       00...01    1 00..00
    |      |    -----------------------------------------------
Positives  |                    0       11...10    0 11..11
    |    Reals   Unsupported    .          .          .
    |      |     8087 Unnormals .          .          .
    |      |                    0       00...01    0 00..00
    |      |    -----------------------------------------------
    |      |                    0       00...00    1 11..11
    |      |     Pseudo-        .          .          .
    |      |     denormals      .          .          .
    |      |                    0       00...00    1 00..00
    |      |    -----------------------------------------------
    |      |                    0       00...00    0 11..11
    |      |     Denormals      .          .          .
    |      |                    .          .          .
    |      |                    0       00...00    0 00..01
    |      +---------------------------------------------------
    |      |     Zero           0       00...00    0 00..00
    +----------------------------------------------------------
    +----------------------------------------------------------
    |      |     Zero           1       00...00    0 00..00
    |      +---------------------------------------------------
    |      |                    1       00...00    0 00..01
    |      |     Denormals      .          .          .
    |      |                    .          .          .
    |      |                    1       00...00    0 11..11
    |      |    -----------------------------------------------
    |      |                    1       00...00    1 00..00
    |    Reals   Pseudo-        .          .          .
    |      |     denormals      .          .          .
    |      |                    1       00...00    1 11..11
    |      |    -----------------------------------------------
    |      |                    1       00...01    0 00..00
Negatives  |     Unsupported    .          .          .
    |      |     8087 Unnormals .          .          .
    |      |                    1       11...10    0 11..11
    |      |    -----------------------------------------------
    |      |                    1       00...01    1 00..00
    |      |     Normals        .          .          .
    |      |                    .          .          .
    |      |                    1       11...10    1 11..11
    |      +---------------------------------------------------
    |        Infinity           1       11...11    1 00..00
    |      +---+-----------------------------------------------
    |      |   |                1       11...11    1 00..01
    |      |  Signaling         .          .          .
    |      |   |                .          .          .
    |      |   |                1       11...11    1 01..11
    |      |   +-----------------------------------------------
    |     NaNs |    Indefinite  1       11...11    1 10..00
    |      |   |     ------------------------------------------
    |      |   |                1       11...11    1 10..00
    |      |  Quiet             .          .          .
    |      |   |                .          .          .
    |      |   |                1       11...11    1 11..11
    +------+---+-----------------------------------------------
                                     |<-15 bits->|<-64 bits->|
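
The extended-real classes of Table 3-9, together with the unsupported formats
described in the preceding section, can be separated by testing the exponent
field and the explicit integer bit. The Python sketch below works on 80-bit
patterns held as integers; it is an illustration only, the function name and
labels are hypothetical, and the sign bit is ignored because it does not
affect the class.

```python
def classify_extended(bits):
    """Name the Table 3-9 class of an 80-bit extended-real bit pattern."""
    exponent = (bits >> 64) & 0x7FFF
    significand = bits & 0xFFFFFFFFFFFFFFFF
    integer_bit = significand >> 63
    fraction = significand & 0x7FFFFFFFFFFFFFFF
    if exponent == 0x7FFF:
        if not integer_bit:
            return "unsupported"             # pseudo-NaN or pseudoinfinity
        if fraction == 0:
            return "infinity"
        return "QNaN" if fraction >> 62 else "SNaN"
    if exponent == 0:
        if significand == 0:
            return "zero"
        return "pseudodenormal" if integer_bit else "denormal"
    # Nonzero exponent: a cleared integer bit marks an unnormal or pseudozero.
    return "normal" if integer_bit else "unsupported"


one = (0x3FFF << 64) | (1 << 63)
print(classify_extended(one))
```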
2838
2839
2840 3.2  Numeric Exceptions
2841
2842 The 80387 can recognize six classes of numeric exception conditions while
2843 executing numeric instructions:
2844
  1.  I -- Invalid operation
          o  Stack fault
          o  IEEE standard invalid operation
  2.  Z -- Divide-by-zero
  3.  D -- Denormalized operand
  4.  O -- Numeric overflow
  5.  U -- Numeric underflow
  6.  P -- Inexact result (precision)
2853
2854
2855 3.2.1  Handling Numeric Exceptions
2856
2857 When numeric exceptions occur, the NPX takes one of two possible courses of
2858 action:
2859
  o  The NPX can itself handle the exception, producing the most reasonable
     result and allowing numeric program execution to continue undisturbed.

  o  A software exception handler can be invoked by the CPU to handle the
     exception.
2865
2866 Each of the six exception conditions described above has a corresponding
2867 flag bit in the 80387 status word and a mask bit in the 80387 control word.
2868 If an exception is masked (the corresponding mask bit in the control
2869 word = 1), the 80387 takes an appropriate default action and continues with
2870 the computation. If the exception is unmasked (mask = 0), the 80387 asserts
2871 the ERROR# output to the 80386 to signal the exception and invoke a software
2872 exception handler.
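
The relationship between flag bits and mask bits can be sketched as follows.
This Python fragment is an illustration, not NPX code: the function names are
hypothetical, and it relies on the six exception flags occupying bits 0
through 5 of the status word in the order I, D, Z, O, U, P, with the
corresponding mask bits in the same positions of the control word.

```python
EXCEPTIONS = ("I", "D", "Z", "O", "U", "P")   # bits 0..5 of both words


def flagged(status_word):
    """Exceptions whose flag bit is set in the status word."""
    return {name for bit, name in enumerate(EXCEPTIONS)
            if (status_word >> bit) & 1}


def unmasked_pending(status_word, control_word):
    """Exceptions whose flag is set while their mask bit is 0."""
    return {name for bit, name in enumerate(EXCEPTIONS)
            if (status_word >> bit) & 1 and not (control_word >> bit) & 1}


# Flags I and Z set; control word 037FH masks every exception.
print(sorted(flagged(0x0005)), sorted(unmasked_pending(0x0005, 0x037F)))
```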
2873
2874 Note that when exceptions are masked, the NPX may detect multiple
2875 exceptions in a single instruction, because it continues executing the
2876 instruction after performing its masked response. For example, the 80387
2877 could detect a denormalized operand, perform its masked response to this
2878 exception, and then detect an underflow.
2879
2880
2881 3.2.1.1  Automatic Exception Handling
2882
2883 The 80387 NPX has a default fix-up activity for every possible exception
2884 condition it may encounter. These masked-exception responses are designed to
2885 be safe and are generally acceptable for most numeric applications.
2886
2887 As an example of how even severe exceptions can be handled safely and
2888 automatically using the NPX's default exception responses, consider a
2889 calculation of the parallel resistance of several values using only the
2890 standard formula (Figure 3-3). If R{1} becomes zero, the circuit resistance
2891 becomes zero. With the divide-by-zero and precision exceptions masked, the
2892 80387 NPX will produce the correct result.
2893
2894 By masking or unmasking specific numeric exceptions in the NPX control
2895 word, NPX programmers can delegate responsibility for most exceptions to the
2896 NPX, reserving the most severe exceptions for programmed exception handlers.
2897 Exception-handling software is often difficult to write, and the NPX's
2898 masked responses have been tailored to deliver the most reasonable result
2899 for each condition. For the majority of applications, masking all
2900 exceptions other than invalid-operation yields satisfactory results with the
2901 least programming effort. An invalid-operation exception normally indicates
2902 a program error that must be corrected; this exception should not normally
2903 be masked.
2904
2905 The exception flags in the NPX status word provide a cumulative record of
2906 exceptions that have occurred since these flags were last cleared. Once set,
2907 these flags can be cleared only by executing the FCLEX (clear exceptions)
2908 instruction, by reinitializing the NPX, or by overwriting the flags with an
2909 FRSTOR or FLDENV instruction. This allows a programmer to mask all
2910 exceptions (except invalid operation), run a calculation, and then inspect
2911 the status word to see if any exceptions were detected at any point in the
2912 calculation.
2913
2914
2915 Figure 3-3.  Arithmetic Example Using Infinity
2916
                           ---+------+------+
                              |      |      |
                              |      |      |
                              |      |      |
                             R{1}   R{2}   R{3}
                              |      |      |
                              |      |      |
                              |      |      |
                           ---+------+------+

                                                1
          EQUIVALENT RESISTANCE = ------------------------------
                                   1/R{1}  +  1/R{2}  +  1/R{3}
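
The arithmetic of Figure 3-3 can be followed step by step in a short sketch.
Python itself raises an error on floating-point division by zero, so the
fragment below emulates the masked zero-divide response (a signed infinity) in
a hypothetical helper named masked_reciprocal. It illustrates the arithmetic
only and is not a model of the NPX.

```python
import math


def masked_reciprocal(x):
    """1/x with the masked zero-divide response: a signed infinity."""
    if x == 0:
        return math.copysign(math.inf, x)
    return 1.0 / x


def parallel_resistance(*resistances):
    """Equivalent resistance using only the standard formula."""
    total = sum(masked_reciprocal(float(r)) for r in resistances)
    return masked_reciprocal(total)


# R1 = 0: 1/R1 is +infinity, the sum is +infinity, and 1/infinity is 0.
print(parallel_resistance(0, 10, 20))
```

No special case for a zero resistance is needed; the infinity produced by the
first division carries through the sum and yields zero at the end.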
2930
2931
2932 3.2.1.2  Software Exception Handling
2933
2934 If the NPX encounters an unmasked exception condition, it signals the
2935 exception to the 80386 CPU using the ERROR# status line between the two
2936 processors.
2937
2938 The next time the 80386 CPU encounters a WAIT or ESC instruction in its
2939 instruction stream, the 80386 will detect the active condition of the ERROR#
2940 status line and automatically trap to an exception response routine using
2941 interrupt #16, the "processor extension error" exception.
2942
2943 This exception response routine is normally a part of the systems software.
2944 Typical exception responses may include:
2945
  o  Incrementing an exception counter for later display or printing

  o  Printing or displaying diagnostic information (e.g., the 80387
     environment and registers)

  o  Aborting further execution

  o  Using the exception pointers to build an instruction that will run
     without exception and executing it
2955
2956 For 80386 systems having systems software support for the 80387 NPX,
2957 applications programmers should consult the operating system's reference
2958 manuals for the appropriate system response to NPX exceptions. For systems
2959 programmers, specific details on writing software exception handlers are
2960 included in Chapter 6.
2961
2962
2963 3.2.2  Invalid Operation
2964
2965 This exception may occur in response to two general classes of operations:
2966
2967   1.  Stack operations
2968   2.  Arithmetic operations
2969
The stack flag (SF) of the status word indicates which class of operation
caused the exception. When SF is 1, a stack operation has resulted in stack
overflow or underflow; when SF is 0, an arithmetic instruction has
encountered an invalid operand.
2974
2975
2976 3.2.2.1  Stack Exception
2977
2978 When SF is 1, indicating a stack operation, the O/U# bit of the condition
2979 code (bit C{1}) distinguishes between stack overflow and underflow as
2980 follows:
2981
O/U# = 1   Stack overflow -- an instruction attempted to push down a
           nonempty stack location.

O/U# = 0   Stack underflow -- an instruction attempted to read an
           operand from an empty stack location.
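
A handler could combine these bits as in the following Python sketch. It is an
illustration only: the function name is hypothetical, and it relies on the
invalid-operation flag being bit 0, SF being bit 6, and C{1} being bit 9 of
the status word.

```python
IE, SF, C1 = 1 << 0, 1 << 6, 1 << 9    # status word bit positions


def invalid_operation_cause(status_word):
    """Describe the cause of an invalid-operation exception."""
    if not status_word & IE:
        return "no invalid operation"
    if not status_word & SF:
        return "invalid arithmetic operand"
    return "stack overflow" if status_word & C1 else "stack underflow"


for word in (0x0001, 0x0041, 0x0241):
    print(hex(word), invalid_operation_cause(word))
```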
2987
2988 When the invalid-operation exception is masked, the 80387 returns the QNaN
2989 indefinite. This value overwrites the destination register, destroying
2990 its original contents.
2991
2992 When the invalid-operation exception is not masked, the 80386 exception
2993 "processor extension error" is triggered. TOP is not changed, and the source
2994 operands remain unaffected.
2995
2996
2997 3.2.2.2  Invalid Arithmetic Operation
2998
2999 This class includes the invalid operations defined in IEEE Std 754. The
3000 80387 reports an invalid operation in any of the cases shown in Table 3-10.
3001 Also shown in this table are the 80387's responses when the invalid
3002 exception is masked. When unmasked, the 80386 exception "processor extension
3003 error" is triggered, and the operands remain unaltered. An invalid operation
3004 generally indicates a program error.
3005
3006
3007 Table 3-10.  Masked Responses to Invalid Operations
3008
3009
3010 Condition                           Masked Response
3011
3012 Any arithmetic operation            Return the QNaN indefinite.
3013 on an unsupported format.
3014
3015 Any arithmetic operation            Return a QNaN (refer to the section
3016 on a signaling NaN.                 "Rules for Generating QNaNs").
3017
3018 Compare and test operations:        Set condition codes "not comparable."
3019 one or both operands is a NaN.
3020
3021 Addition of opposite-signed         Return the QNaN indefinite.
3022 infinities or subtraction of
3023 like-signed infinities.
3024
Multiplication: infinity * 0;       Return the QNaN indefinite.
or 0 * infinity.

Division: infinity ÷ infinity;      Return the QNaN indefinite.
or 0 ÷ 0.

Remainder instructions FPREM,       Return the QNaN indefinite; set C{2}.
FPREM1 when modulus (divisor)
is zero or dividend is infinity.

Trigonometric instructions FCOS,    Return the QNaN indefinite; set C{2}.
FPTAN, FSIN, FSINCOS when
argument is infinity.

FSQRT of negative operand (except   Return the QNaN indefinite.
FSQRT (-0) = -0), FYL2X of
negative operand (except FYL2X
(-0) = -infinity), FYL2XP1 of
operand more negative than -1.

FIST(P) instructions when source    Store integer indefinite.
register is empty, a NaN,
infinity, or exceeds representable
range of destination.

FBSTP instruction when source       Store packed decimal indefinite.
register is empty, a NaN,
infinity, or exceeds 18 decimal
digits.
3050 exceeds 18 decimal digits.
3051
3052 FXCH instruction when one or        Change empty registers to the QNaN
3053 both registers are tagged empty.    indefinite and then perform exchange.
3054
3055
3056 3.2.3  Division by Zero
3057
If an instruction attempts to divide a finite nonzero operand by zero, the
80387 will report a zero-divide exception. This is possible for
F(I)DIV(R)(P) as well as the other instructions that perform division
internally: FYL2X and FXTRACT. The masked response for FDIV and FYL2X is to
return an infinity signed with the exclusive OR of the signs of the
operands. For FXTRACT, ST(1) is set to -infinity; ST is set to zero with the
same sign as the original operand. If the divide-by-zero exception is
unmasked, the 80386 exception "processor extension error" is triggered; the
operands remain unaltered.
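
The sign rule for the masked response can be illustrated with a short Python
sketch. The function name is hypothetical, and the fragment covers only the
zero-divide case described above (a finite nonzero dividend and a zero
divisor). Other divisions by zero, such as 0 ÷ 0, are not zero-divide
conditions, and the sketch rejects them.

```python
import math


def masked_divide(dividend, divisor):
    """Quotient with the masked zero-divide response for finite nonzero / 0."""
    if divisor == 0:
        if dividend == 0 or not math.isfinite(dividend):
            raise ValueError("not a zero-divide condition")
        # Result sign is the exclusive OR of the operand signs.
        negative = math.copysign(1.0, dividend) != math.copysign(1.0, divisor)
        return -math.inf if negative else math.inf
    return dividend / divisor


print(masked_divide(1.0, 0.0), masked_divide(1.0, -0.0), masked_divide(-2.0, -0.0))
```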
3067
3068
3069 3.2.4  Denormal Operand
3070
If an arithmetic instruction attempts to operate on a denormal operand, the
NPX reports the denormal-operand exception. Denormal operands may have
reduced significance due to lost low-order bits; therefore, it may be
advisable in certain applications to preclude operations on these operands.
This can be accomplished by an exception handler that responds to unmasked
denormal exceptions. Most users will mask this exception so that
computation may proceed; any loss of accuracy can then be analyzed by the
user when the final result is delivered.
3079
When this exception is masked, the 80387 sets the D-bit in the status word,
then proceeds with the instruction. Gradual underflow and denormal numbers
as handled on the 80387 will produce results at least as good as, and often
better than, what could be obtained from a machine that flushes underflows
to zero. In fact, a denormal operand in single- or double-precision format
will be normalized to the extended-real format when loaded into the 80387.
Subsequent operations will benefit from the additional precision of the
extended-real format used internally.
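
The normalization mentioned above can be worked through on bit patterns. The
following Python sketch converts a single-real denormal into the fields of the
equivalent extended-real number. It is an arithmetic illustration only: the
function name is hypothetical, and it relies on the value of a single-real
denormal being its 23-bit fraction times 2^(-149) and on the extended-real
exponent bias of 16383.

```python
def single_denormal_to_extended(bits):
    """Return (sign, biased exponent, 64-bit significand) of the extended form."""
    sign = bits >> 31
    fraction = bits & 0x7FFFFF
    if ((bits >> 23) & 0xFF) != 0 or fraction == 0:
        raise ValueError("not a single-real denormal")
    top = fraction.bit_length() - 1          # position of the leading 1 bit
    exponent = (top - 149) + 16383           # value = fraction * 2**-149
    significand = fraction << (63 - top)     # leading 1 becomes the integer bit
    return sign, exponent, significand


sign, exponent, significand = single_denormal_to_extended(0x00000001)
print(sign, exponent, hex(significand))
```

The integer bit of the result is always 1, so the extended-real value is a
normal number and no low-order bits are lost in the conversion.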
3088
3089 When this exception is not masked, the D-bit is set and the exception
3090 handler is invoked. The operands are not changed by the instruction and are
3091 available for inspection by the exception handler.
3092
3093 If an 8087/80287 program uses the denormal exception to automatically
3094 normalize denormal operands, then that program can run on an 80387 by
3095 masking the denormal exception. The 8087/80287 denormal exception handler
3096 would not be used by the 80387 in this case. A numerics program runs faster
3097 when the 80387 performs normalization of denormal operands. A program can
3098 detect at run-time whether it is running on an 80387 or 8087/80287 and
3099 disable the denormal exception when an 80387 is used. The following code
3100 sequence is recommended to distinguish between an 80387 and an 8087/80287.
3101
      FINIT              ; Use default infinity mode:
                         ;  projective for 8087/80287,
                         ;  affine for 80387
      FLD1               ; Generate infinity
      FLDZ
      FDIV               ; ST = 1/0 = +infinity (zero-divide is
                         ;  masked after FINIT)
      FLD    ST          ; Form negative infinity
      FCHS
      FCOMPP             ; Compare +infinity with -infinity
      FSTSW  temp        ; 8087/80287 will say they are equal
      MOV    AX, temp    ; temp is a word variable in memory
      SAHF               ; C3 of the status word goes to ZF
      JNZ    Using_80387 ; Not equal: affine infinity, so an 80387
3116
3117 The denormal-operand exception of the 80387 permits emulation of arithmetic
3118 on unnormal operands as provided by the 8087/80287. The standard does not
3119 require the denormal exception nor does it recognize the unnormal data type.
3120
3121
3122 3.2.5  Numeric Overflow and Underflow
3123
If the exponent of a numeric result is too large for the destination real
format, the 80387 signals a numeric overflow. Conversely, if the exponent of
a result is too small to be represented in the destination format, a numeric
underflow is signaled. If either of these exceptions occurs, the result of
the operation is outside the range of the destination real format.
3129
3130 Typical algorithms are most likely to produce extremely large and small
3131 numbers in the calculation of intermediate, rather than final, results.
3132 Because of the great range of the extended-precision format (recommended as
3133 the destination format for intermediates), overflow and underflow are
3134 relatively rare events in most 80387 applications.
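
As a rough illustration of why extended-precision intermediates rarely
overflow, the Python sketch below compares the binary exponent of an
intermediate product with the normal exponent range of each real format.
The table EXPONENT_RANGE and the function classify are hypothetical
helpers; only exponents are modeled, not significands or rounding.

```python
import math

# Smallest and largest normal binary exponents of the three real formats.
EXPONENT_RANGE = {
    "single": (-126, 127),
    "double": (-1022, 1023),
    "extended": (-16382, 16383),
}


def classify(binary_exponent, fmt):
    """Classify a value of the form 1.x * 2^binary_exponent for format fmt."""
    lowest, highest = EXPONENT_RANGE[fmt]
    if binary_exponent > highest:
        return "overflow"
    if binary_exponent < lowest:
        return "underflow"
    return "in range"


# Squaring x = 1e200 gives a binary exponent of twice floor(log2(x)),
# or one more than that.
square_exponent = 2 * math.floor(math.log2(1e200))
```

Here classify(square_exponent, "double") reports an overflow, while the
same exponent is in range for the extended format.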
3135
3136
3137 3.2.5.1  Overflow
3138
3139 The overflow exception can occur whenever the rounded true result would
3140 exceed in magnitude the largest finite number in the destination format. The
3141 exception can occur in the execution of most of the arithmetic instructions
3142 and in some of the conversion instructions; namely, FST(P), F(I)ADD(P),
3143 F(I)SUB(R)(P), F(I)MUL(P), FDIV(R)(P), FSCALE, FYL2X, and FYL2XP1.
3144
3145 The response to an overflow condition depends on whether the overflow
3146 exception is masked:
3147
3148   -  Overflow exception masked. The value returned depends on the rounding
3149      mode as Table 3-11 illustrates.
3150
3151   -  Overflow exception not masked. The unmasked response depends on
3152      whether the instruction is supposed to store the result on the stack
3153      or in memory:
3154
3155      -- Destination is the stack. The true result is divided by 2^(24,576)
3156         and rounded. (The bias 24,576 is equal to 3 * 2^(13).) The
3157         significand is rounded to the appropriate precision (according to
3158         the precision control (PC) bit of the control word, for those
3159         instructions controlled by PC, otherwise to extended precision).
3160         The roundup bit (C{1}) of the status word is set if the
3161         significand was rounded upward.
3162
3163         The biasing of the exponent by 24,576 normally translates the
3164         number as nearly as possible to the middle of the exponent range
3165         so that, if desired, it can be used in subsequent scaled
3166         operations with less risk of causing further exceptions. With the
3167         instruction FSCALE, however, it can happen that the result is too
3168         large and overflows even after biasing. In this case, the unmasked
3169         response is exactly the same as the masked round-to-nearest
3170         response, namely ± infinity. The intention of this feature is to
3171         ensure the trap handler will discover that a translation of the
3172         exponent by -24576 would not work correctly without obliging the
3173         programmer of Decimal-to-Binary or Exponential functions to
3174         determine which trap handler, if any, should be invoked.
3175
3176      -- Destination is memory (this can occur only with the store
3177         instructions). No result is stored in memory. Instead, the operand
3178         is left intact in the stack. Because the data in the stack is in
3179         extended-precision format, the exception handler has the option
3180         either of reexecuting the store instruction after proper
3181         adjustment of the operand or of rounding the significand on the
3182         stack to the destination's precision as the standard requires. The
3183         exception handler should ultimately store a value into the
3184         destination location in memory if the program is to continue.
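
The exponent adjustment for a stack destination can be sketched in Python
as follows. This is an exponent-only model under stated assumptions:
significand rounding is ignored, 16383 is taken as the largest finite
extended-real exponent, and the function names are hypothetical.

```python
BIAS_ADJUST = 3 * 2**13          # 24,576, as stated in the text
EXTENDED_MAX_EXPONENT = 16383    # largest finite extended-real exponent


def unmasked_overflow_to_stack(true_exponent):
    """Return the exponent delivered to the stack, or 'infinity'."""
    wrapped = true_exponent - BIAS_ADJUST
    if wrapped > EXTENDED_MAX_EXPONENT:
        # Still too large after biasing; per the text this can happen
        # only with FSCALE.
        return "infinity"
    return wrapped


def handler_recover_exponent(delivered_exponent):
    """A trap handler undoes the adjustment to find the true exponent."""
    return delivered_exponent + BIAS_ADJUST
```

A true exponent of 20000 is delivered as -4576, close to the middle of the
range, and the handler recovers 20000 by adding the adjustment back.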
3185
3186
3187 Table 3-11.  Masked Overflow Results
3188
3189 Rounding              Sign of
3190 Mode                True Result         Result
3191
3192 To nearest               +               +∞
3193                          -               -∞
3194 Toward -∞                +               Largest finite positive number
3195                          -               -∞
3196 Toward +∞                +               +∞
3197                          -               Largest finite negative number
3198 Toward zero              +               Largest finite positive number
3199                          -               Largest finite negative number
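
Table 3-11 can be restated as a lookup. In this Python sketch the function
name and the mode strings are hypothetical, and sys.float_info.max merely
stands in for the largest finite number of whatever destination format is
in use.

```python
import math
import sys

LARGEST = sys.float_info.max   # stand-in for the destination's largest finite value


def masked_overflow_result(rounding_mode, sign):
    """rounding_mode: 'nearest', 'down', 'up', or 'zero'; sign: +1 or -1."""
    returns_infinity = {
        ("nearest", +1): True,
        ("nearest", -1): True,
        ("down", +1): False,
        ("down", -1): True,
        ("up", +1): True,
        ("up", -1): False,
        ("zero", +1): False,
        ("zero", -1): False,
    }[(rounding_mode, sign)]
    magnitude = math.inf if returns_infinity else LARGEST
    return math.copysign(magnitude, sign)
```

The pattern is that a directed rounding mode returns infinity only when
infinity lies in the direction of rounding.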
3200
3201
3202 3.2.5.2  Underflow
3203
3204 Underflow can occur in the execution of the instructions FST(P), FADD(P),
3205 FSUB(RP), FMUL(P), F(I)DIV(RP), FSCALE, FPREM(1), FPTAN, FSIN, FCOS,
3206 FSINCOS, FPATAN, F2XM1, FYL2X, and FYL2XP1.
3207
3208 Two related events contribute to underflow:
3209
3210   1.  Creation of a tiny result which, because it is so small, may cause
3211       some other exception later (such as overflow upon division).
3212
3213   2.  Creation of an inexact result; i.e., the delivered result differs from
3214       what would have been computed were both the exponent range and
3215       precision unbounded.
3216
3217 Which of these events triggers the underflow exception depends on whether
3218 the underflow exception is masked:
3219
3220   1.  Underflow exception masked. The underflow exception is signaled when
3221       the result is both tiny and inexact.
3222
3223   2.  Underflow exception not masked. The underflow exception is signaled
3224       when the result is tiny, regardless of inexactness.
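
The two triggering rules can be sketched in Python for a double-real
destination. This is a simplified model, not the 80387's internal test:
tininess is judged on the exact (unrounded) result, the default
round-to-nearest conversion of Python is used, and the function name is
hypothetical.

```python
from fractions import Fraction

DOUBLE_MIN_NORMAL = Fraction(1, 2**1022)   # smallest normal double real


def underflow_signaled(exact, masked):
    """exact: the infinitely precise result as a Fraction."""
    tiny = exact != 0 and abs(exact) < DOUBLE_MIN_NORMAL
    inexact = Fraction(float(exact)) != exact   # does rounding change it?
    if masked:
        return tiny and inexact
    return tiny


exact_denormal = Fraction(1, 2**1074)      # tiny, but exactly representable
inexact_denormal = Fraction(3, 2**1076)    # tiny and not representable
```

An exactly representable denormal result signals underflow only when the
exception is unmasked; an inexact tiny result signals it in both cases.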
3225
3226 The response to an underflow exception also depends on whether the
3227 exception is masked:
3228
3229   1.  Masked response. The result is denormal or zero. The precision
3230       exception is also triggered.
3231
3232   2.  Unmasked response. The unmasked response depends on whether the
3233       instruction is supposed to store the result on the stack or in memory:
3234
3235       -  Destination is the stack. The true result is multiplied by
3236          2^(24,576) and rounded. (The bias 24,576 is equal to 3 * 2^(13).)
3237          The significand is rounded to the appropriate precision (according
3238          to the precision control (PC) bit of the control word, for those
3239          instructions controlled by PC, otherwise to extended precision).
3240          The roundup bit (C{1}) of the status word is set if the significand
3241          was rounded upward.
3242
3243          The biasing of the exponent by 24,576 normally translates the
3244          number as nearly as possible to the middle of the exponent range so
3245          that, if desired, it can be used in subsequent scaled operations
3246          with less risk of causing further exceptions. With the instruction
3247          FSCALE, however, it can happen that the result is too tiny and
3248          underflows even after biasing. In this case, the unmasked response
3249          is exactly the same as the masked round-to-nearest response, namely
3250          ±0. The intention of this feature is to ensure the trap handler
3251          will discover that a translation by +24576 would not work correctly
3252          without obliging the programmer of Decimal-to-Binary or Exponential
3253          functions to determine which trap handler, if any, should be
3254          invoked.
3255
3256       -  Destination is memory (this can occur only with the store
3257          instructions). No result is stored in memory. Instead, the operand
3258          is left intact in the stack. Because the data in the stack is in
3259          extended-precision format, the exception handler has the option
3260          either of reexecuting the store instruction after proper adjustment
3261          of the operand or of rounding the significand on the stack to the
3262          destination's precision as the standard requires. The exception
3263          handler should ultimately store a value into the destination
3264          location in memory if the program is to continue.
3265
3266
3267 3.2.6  Inexact (Precision)
3268
3269 This exception condition occurs if the result of an operation is not
3270 exactly representable in the destination format. For example, the fraction
3271 1/3 cannot be precisely represented in binary form. This exception occurs
3272 frequently and indicates that some (generally acceptable) accuracy has been
3273 lost.
3274
3275 All the transcendental instructions are inexact by definition; they always
3276 cause the inexact exception.
3277
3278 The C{1} (roundup) bit of the status word indicates whether the inexact
3279 result was rounded up (C{1} = 1) or chopped (C{1} = 0).
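
The meaning of the inexact condition and of the roundup bit can be
illustrated in Python with exact rational arithmetic. The sketch assumes a
double-real destination and round-to-nearest (Python's own division), not
the extended format, and the function name is hypothetical.

```python
from fractions import Fraction


def inexact_and_roundup(numerator, denominator):
    """Return (inexact, c1) for numerator/denominator rounded to double real."""
    exact = Fraction(numerator, denominator)
    delivered = Fraction(numerator / denominator)   # exact value of the rounded quotient
    inexact = delivered != exact
    c1 = 1 if abs(delivered) > abs(exact) else 0    # 1 = rounded up in magnitude
    return inexact, c1
```

For example, 1/3 is inexact and is rounded down (C1 = 0), 1/10 is inexact
and is rounded up (C1 = 1), and 1/2 is exact.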
3280
3281 The inexact exception accompanies the underflow exception when there is
3282 also a loss of accuracy. When underflow is masked, the underflow exception
3283 is signaled only when there is a loss of accuracy; therefore the precision
3284 flag is always set as well. When underflow is unmasked, there may or may not
3285 have been a loss of accuracy; the precision bit indicates which is the case.
3286
3287 This exception is provided for applications that need to perform exact
3288 arithmetic only. Most applications will mask this exception. The 80387
3289 delivers the rounded or over/underflowed result to the destination,
3290 regardless of whether a trap occurs.
3291
3292
3293 3.2.7  Exception Priority
3294
3295 The 80387 deals with exceptions according to a predetermined precedence.
3296 Precedence in exception handling means that higher-priority exceptions are
3297 flagged and results are delivered according to the requirements of that
3298 exception. Lower-priority exceptions may not be flagged even if they occur.
3299 For example, dividing an SNaN by zero causes an invalid-operand exception
3300 (due to the SNaN) and not a zero-divide exception; the masked result is the
3301 QNaN real indefinite, not ∞. A denormal or inexact (precision) exception,
3302 however, can accompany a numeric underflow or overflow exception.
3303
3304 The exception precedence is as follows:
3305
3306   1.  Invalid operation exception, subdivided as follows:
3307
3308       a. Stack underflow.
3309       b. Stack overflow.
3310       c. Operand of unsupported format.
3311       d. SNaN operand.
3312
3313   2.  QNaN operand. Though this is not an exception, if one operand is a
3314       QNaN, dealing with it has precedence over lower-priority exceptions.
3315       For example, a QNaN divided by zero results in a QNaN, not a
3316       zero-divide exception.
3317
3318   3.  Any other invalid-operation exception not mentioned above or zero
3319       divide.
3320
3321   4.  Denormal operand. If masked, then instruction execution continues,
3322       and a lower-priority exception can occur as well.
3323
3324   5.  Numeric overflow and underflow. Inexact result (precision) can be
3325       flagged as well.
3326
3327   6.  Inexact result (precision).
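
The precedence list can be expressed as an ordered table. The Python
sketch below returns only the highest-ranking condition; it deliberately
omits the cases noted above in which a lower-priority exception is flagged
as well. The condition strings and the function name are hypothetical.

```python
PRECEDENCE = [
    "stack underflow",
    "stack overflow",
    "unsupported format",
    "SNaN operand",
    "QNaN operand",                            # not an exception, but ranked here
    "other invalid operation or zero divide",
    "denormal operand",
    "numeric overflow or underflow",
    "inexact result",
]


def highest_priority(conditions):
    """Return the condition in `conditions` that ranks first in PRECEDENCE."""
    return min(conditions, key=PRECEDENCE.index)
```

Dividing an SNaN by zero, for instance, presents both "SNaN operand" and
"other invalid operation or zero divide"; the function selects the former.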
3328
3329
3330 3.2.8  Standard Underflow/Overflow Exception Handler
3331
3332 As long as the underflow and overflow exceptions are masked, no additional
3333 software is required to cause the output of the 80387 to conform to the
3334 requirements of IEEE Std 754. When unmasked, these exceptions give the
3335 exception handler an additional option in the case of store instructions. No
3336 result is stored in memory; instead, the operand is left intact on the
3337 stack. The handler may round the significand of the operand on the stack to
3338 the destination's precision as the standard requires, or it may adjust the
3339 operand and reexecute the faulting instruction.
3340
3341
3342
3343 Chapter 4  The 80387 Instruction Set
3344
3345 ---------------------------------------------------------------------------
3346
3347 This chapter describes the operation of all 80387 instructions. Within this
3348 section, the instructions are divided into six functional classes:
3349
3350   -  Data Transfer instructions
3351   -  Nontranscendental instructions
3352   -  Comparison instructions
3353   -  Transcendental instructions
3354   -  Constant instructions
3355   -  Processor Control instructions
3356
3357 Throughout this chapter, the instruction set is described as it appears to
3358 the ASM386 programmer who is coding a program. Not included in this chapter
3359 are details of instruction format, encoding, and execution times. This
3360 detailed information may be found in Appendix A. Refer also to Appendix B
3361 for a summary of the exceptions caused by each instruction.
3362
3363
3364 4.1  Compatibility With the 80287 and 8087
3365
3366 The instruction set for the 80387 NPX is largely the same as that for the
3367 80287 NPX (used with 80286 systems) and that for the 8087 NPX (used with
3368 8086 and 8088 systems). Most object programs generated for the 80287 or 8087
3369 will execute without change on the 80387. Several instructions are new to
3370 the 80387, and several 80287 and 8087 instructions perform no useful
3371 function on the 80387. Appendix C and Appendix D give details of these
3372 instruction set differences.
3373
3374
3375 4.2  Numeric Operands
3376
3377 The typical NPX instruction accepts one or two operands as inputs, operates
3378 on these, and produces a result as an output. An operand is most often the
3379 contents of a register or of a memory location. The operands of some
3380 instructions are predefined; for example, FSQRT always takes the square root
3381 of the number in the top NPX stack element. Others allow, or require, the
3382 programmer to explicitly code the operand(s) along with the instruction
3383 mnemonic. Still others accept one explicit operand and one implicit
3384 operand, which is usually the top NPX stack element. All 80387 instructions
3385 that have a data operand use ST as one operand or as the only operand.
3386
3387 Whether supplied by the programmer or utilized automatically, the two basic
3388 types of operands are sources and destinations. A source operand simply
3389 supplies one of the inputs to an instruction; it is not altered by the
3390 instruction. Even when an instruction converts the source operand from one
3391 format to another (e.g., real to integer), the conversion is actually
3392 performed in an internal work area to avoid altering the source operand. A
3393 destination operand may also provide an input to an instruction. It is
3394 distinguished from a source operand, however, because its content may be
3395 altered when it receives the result produced by the operation; that is, the
3396 destination is replaced by the result.
3397
3398 Many instructions allow their operands to be coded in more than one way.
3399 For example, FADD (add real) may be written without operands, with only a
3400 source or with a destination and a source. The instruction descriptions in
3401 this section employ the simple convention of separating alternative operand
3402 forms with slashes; the slashes, however, are not coded. Consecutive slashes
3403 indicate an option of no explicit operands. The operands for FADD are thus
3404 described as
3405
3406   //source/destination, source
3407
3408 This means that FADD may be written in any of three ways:
3409
3410 Written Form               Action
3411
3412 FADD                       Add ST to ST(1), put result in ST(1), then pop ST
3413 FADD source                Add source to ST(0)
3414 FADD destination, source   Add source to destination
3415
3416 The assembler can allow the same instruction to be specified in different
3417 ways; for example:
3418
3419 FADD = FADDP ST(1), ST
3420 FADD ST(1) = FADD ST, ST(1)
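
The three written forms of FADD can be modeled on a Python list in which
element 0 plays the role of ST and element i the role of ST(i). This is a
sketch of the operand conventions only; the function names are
hypothetical, and tags, exceptions, and extended precision are not
modeled.

```python
def fadd_classical(stack):
    """FADD with no operands: ST(1) <- ST(1) + ST, then pop ST."""
    stack[1] += stack[0]
    stack.pop(0)
    return stack


def fadd_memory(stack, source_value):
    """FADD source: add a memory operand to ST."""
    stack[0] += source_value
    return stack


def fadd_register(stack, destination, source):
    """FADD ST(destination), ST(source); one of the two must be ST."""
    if 0 not in (destination, source):
        raise ValueError("one operand must be the stack top")
    stack[destination] += stack[source]
    return stack
```

For a stack holding 1.0, 2.0, 5.0 (top first), the classical form leaves
3.0, 5.0, and the register form with destination ST(2) and source ST
leaves 1.0, 2.0, 6.0.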
3421
3422 When reading this section, it is important to bear in mind that memory
3423 operands may be coded with any of the CPU's memory addressing methods
3424 provided by the ModR/M byte. To review these methods (BASE + (INDEX * SCALE)
3425 + DISPLACEMENT) refer to the 80386 Programmer's Reference Manual.
3426 Chapter 5 also provides several addressing mode examples.
3427
3428
3429 4.3  Data Transfer Instructions
3430
3431 These instructions (summarized in Table 4-1) move operands among elements
3432 of the register stack, and between the stack top and memory. Any of the
3433 seven data types can be converted to extended real and loaded (pushed) onto
3434 the stack in a single operation; they can be stored to memory in the same
3435 manner. The data transfer instructions automatically update the 80387 tag
3436 word to reflect whether the register is empty or full following the
3437 instruction.
3438
3439 Table 4-1.  Data Transfer Instructions
3440
3441 Real Transfers
3442 FLD          Load Real
3443 FST          Store real
3444 FSTP         Store real and pop
3445 FXCH         Exchange registers
3446 Integer Transfers
3447 FILD         Integer load
3448 FIST         Integer store
3449 FISTP        Integer store and pop
3450 Packed Decimal Transfers
3451 FBLD         Packed decimal (BCD) load
3452 FBSTP        Packed decimal (BCD) store and pop
3453
3454
3455 4.3.1  FLD source
3456
3457 FLD (load real) loads (pushes) the source operand onto the top of the
3458 register stack. This is done by decrementing the stack pointer by one and
3459 then copying the content of the source to the new stack top. ST(7) must be
3460 empty to avoid causing an invalid-operation exception. The new stack top is
3461 tagged nonempty. The source may be a register on the stack (ST(i)) or any of
3462 the real data types in memory. If the source is a register, the register
3463 number used is that before TOP is decremented by the instruction. Coding FLD
3464 ST(0) duplicates the stack top. Single and double real source operands are
3465 converted to extended real automatically. Loading an extended real operand
3466 does not require conversion; therefore, the I and D exceptions do not occur
3467 in this case.
3468
3469
3470 4.3.2  FST destination
3471
3472 FST (store real) copies the NPX stack top to the destination, which
3473 may be another register on the stack or a single or double (but not
3474 extended-precision) memory operand. If the destination is single or double
3475 real, the copy of the significand is rounded to the width of the destination
3476 according to the RC field of the control word, and the copy of the exponent
3477 is converted to the width and bias of the destination format. The
3478 over/underflow condition is checked for as well.
3479
3480 If, however, the stack top contains zero, ±∞, or a NaN, then the stack
3481 top's significand is not rounded but is chopped (on the right) to fit the
3482 destination. The exponent is not converted either; it too is chopped on
3483 the right and transferred "as is". This preserves the value's identification
3484 as ∞ or a NaN (exponent all ones) so that it can be properly loaded and used
3485 later in the program if desired.
3486
3487 Note that the 80387 does not signal the invalid-operation exception when
3488 the destination is a nonempty stack element.
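
The effect of storing to a narrower real format can be observed in Python
by packing a value as a single real and reading it back. This sketch
starts from a double rather than an extended real, covers only the
round-to-nearest setting of RC, and uses a hypothetical function name.
Python raises OverflowError for a finite value that is too large, which is
a simplification of the overflow response described above.

```python
import math
import struct


def store_single(value):
    """Round a Python float to single real and return the stored value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]
```

For example, store_single(0.1) returns 0.10000000149011612 because the
significand is rounded to 24 bits, whereas infinity and NaN keep their
identity.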
3489
3490
3491 4.3.3  FSTP destination
3492
3493 FSTP (store real and pop) operates identically to FST except that the NPX
3494 stack is popped following the transfer. This is done by tagging the top
3495 stack element empty and then incrementing TOP. FSTP also permits storing to
3496 an extended-precision real memory variable, whereas FST does not. If the
3497 source operand is a register, the register number used is that before TOP is
3498 incremented by the instruction. Coding FSTP ST(0) is equivalent to popping
3499 the stack with no data transfer.
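
The push and pop behavior described for FLD and FSTP can be sketched as a
small Python model of the register stack. The class and method names are
hypothetical; the model tracks only TOP, the register contents, and an
empty/nonempty tag, and it raises a Python exception where the NPX would
signal an invalid operation.

```python
class RegisterStack:
    """Toy model of the eight-register NPX stack."""

    def __init__(self):
        self.regs = [0.0] * 8
        self.empty = [True] * 8
        self.top = 0                      # TOP after FINIT

    def _phys(self, i):
        return (self.top + i) & 7         # physical register holding ST(i)

    def st(self, i=0):
        return self.regs[self._phys(i)]

    def fld(self, value=None, i=None):
        """FLD: push a memory value, or ST(i) as numbered before the push."""
        if i is not None:
            value = self.st(i)
        new_top = (self.top - 1) & 7
        if not self.empty[new_top]:       # this register is ST(7) before the push
            raise OverflowError("invalid operation: ST(7) is not empty")
        self.top = new_top
        self.regs[new_top] = value
        self.empty[new_top] = False

    def fstp(self, i=None):
        """FSTP: copy ST to ST(i) if given, tag the old top empty, pop."""
        value = self.st(0)
        if i is not None:
            target = self._phys(i)        # numbered before TOP is incremented
            self.regs[target] = value
            self.empty[target] = False
        self.empty[self.top] = True
        self.top = (self.top + 1) & 7
        return value
```

In this model fld(i=0) duplicates the stack top, and fstp(i=0) simply pops
the stack, matching the descriptions of FLD ST(0) and FSTP ST(0).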
3500
3501
3502 4.3.4  FXCH //destination
3503
3504 FXCH (exchange registers) swaps the contents of the destination and the
3505 stack top registers. If the destination is not coded explicitly, ST(1) is
3506 used. Many 80387 instructions operate only on the stack top; FXCH provides a
3507 simple means of effectively using these instructions on lower stack
3508 elements. For example, the following sequence takes the square root of the
3509 third register from the top (assuming that ST is nonempty):
3510
3511 FXCH ST(3)
3512 FSQRT
3513 FXCH ST(3)
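
The same three-step sequence can be traced on a Python list whose element
0 stands for ST. The function names are hypothetical and the model ignores
tags and exceptions.

```python
import math


def fxch(stack, i=1):
    """Swap ST with ST(i); ST(1) is used when no operand is coded."""
    stack[0], stack[i] = stack[i], stack[0]


def sqrt_of_st3(stack):
    fxch(stack, 3)                     # FXCH ST(3)
    stack[0] = math.sqrt(stack[0])     # FSQRT
    fxch(stack, 3)                     # FXCH ST(3)
    return stack
```

Starting from 1.0, 2.0, 3.0, 16.0, the sequence leaves 1.0, 2.0, 3.0, 4.0;
only ST(3) has changed.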
3514
3515
3516 4.3.5  FILD source
3517
3518 FILD (integer load) converts the source memory operand from its binary
3519 integer format (word, short, or long) to extended real and pushes the result
3520 onto the NPX stack. ST(7) must be empty to avoid causing an exception. The
3521 (new) stack top is tagged nonempty. FILD is an exact operation; the source
3522 is loaded with no rounding error.
3523
3524
3525 4.3.6  FIST destination
3526
3527 FIST (integer store) stores the content of the stack top to an integer
3528 according to the RC field (rounding control) of the control word and
3529 transfers the result to the destination, leaving the stack top unchanged.
3530 The destination may define a word or short integer variable. Negative zero
3531 is stored in the same encoding as positive zero: 0000...00.
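
The influence of the RC field on FIST can be sketched in Python. The
sketch assumes the usual RC encoding (00 to nearest, 01 toward negative
infinity, 10 toward positive infinity, 11 toward zero), uses a hypothetical
function name, and does not model the range check against the word or
short integer destination.

```python
import math

ROUNDERS = {
    0b00: round,        # to nearest, ties to even
    0b01: math.floor,   # toward -infinity
    0b10: math.ceil,    # toward +infinity
    0b11: math.trunc,   # toward zero (chop)
}


def fist(st, rc=0b00):
    """Return the integer FIST would store for the stack-top value st."""
    return ROUNDERS[rc](st)
```

With the default setting, fist(2.5) is 2 and fist(3.5) is 4; negative zero
yields the ordinary integer 0.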
3532
3533
3534 4.3.7  FISTP destination
3535
3536 FISTP (integer store and pop) operates like FIST except that it also pops the NPX
3537 stack following the transfer. The destination may be any of the binary
3538 integer data types.
3539
3540
3541 4.3.8  FBLD source
3542
3543 FBLD (packed decimal (BCD) load) converts the content of the source operand
3544 from packed decimal to extended real and pushes the result onto the NPX
3545 stack. ST(7) must be empty to avoid causing an exception. The sign of the
3546 source is preserved, including the case where the value is negative zero.
3547 FBLD is an exact operation; the source is loaded with no rounding error.
3548
3549 The packed decimal digits of the source are assumed to be in the range 0-9.
3550 The instruction does not check for invalid digits (A-FH), and the result of
3551 attempting to load an invalid encoding is undefined.
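
A decoder for the packed decimal format can be sketched in Python,
assuming the usual 10-byte layout: 18 digits in bytes 0 through 8, two per
byte with the least significant byte first, and the sign in bit 7 of
byte 9. The function name is hypothetical. Unlike the NPX, this sketch
rejects invalid digits, and it returns the sign separately so that
negative zero is preserved.

```python
def decode_packed_bcd(data):
    """Return (negative, magnitude) for a 10-byte packed decimal operand."""
    if len(data) != 10:
        raise ValueError("packed decimal operands are 10 bytes long")
    magnitude = 0
    for byte in reversed(data[:9]):           # most significant pair first
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError("invalid BCD digit")   # the NPX does not check
        magnitude = magnitude * 100 + high * 10 + low
    negative = bool(data[9] & 0x80)           # sign is bit 7 of the last byte
    return negative, magnitude
```

For example, the bytes 34H, 12H followed by eight zero bytes decode to
(False, 1234).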
3552
3553
3554 4.3.9  FBSTP destination
3555
3556 FBSTP (packed decimal (BCD) store and pop) converts the content of the
3557 stack top to a packed decimal integer, stores the result at the destination
3558 in memory, and pops the stack. FBSTP rounds a nonintegral value according to
3559 the RC (rounding control) field of the control word.
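
The store direction can be sketched in Python under the same layout
assumption (18 digits in bytes 0 through 8, least significant byte first,
sign in bit 7 of byte 9). Only the round-to-nearest setting of RC is
modeled, the function name is hypothetical, and a Python exception stands
in for the NPX response to a value that needs more than 18 digits.

```python
import math


def encode_packed_bcd(value):
    """Encode a float as 10-byte packed decimal, rounding ties to even."""
    integer = round(abs(value))
    if integer >= 10**18:
        raise OverflowError("value needs more than 18 decimal digits")
    out = bytearray(10)
    for index in range(9):
        integer, pair = divmod(integer, 100)      # two decimal digits per byte
        out[index] = ((pair // 10) << 4) | (pair % 10)
    if math.copysign(1.0, value) < 0:             # sign bit follows the value's sign
        out[9] = 0x80
    return bytes(out)
```

For example, 1234.0 encodes as 34H, 12H followed by eight zero bytes, and
-2.5 rounds to -2 with the sign bit set in the last byte.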
3560
3561
3562 4.4  Nontranscendental Instructions
3563
3564 The 80387's nontranscendental instruction set (Table 4-2) provides a wealth
3565 of variations on the basic add, subtract, multiply, and divide operations,
3566 and a number of other useful functions. These range from a simple absolute
3567 value to a square root instruction that executes faster than ordinary
3568 division; 80387 programmers no longer need to spend valuable time
3569 eliminating square roots from algorithms because they run too slowly. Other
3570 nontranscendental instructions perform exact modulo division, round real
3571 numbers to integers, and scale values by powers of two.
3572
3573 The 80387's basic nontranscendental instructions (addition, subtraction,
3574 multiplication, and division) are designed to encourage the development of
3575 very efficient algorithms. In particular, they allow the programmer to
3576 reference memory as easily as the NPX register stack.
3577
3578 Table 4-3 summarizes the available operation/operand forms that are
3579 provided for basic arithmetic. In addition to the four normal operations,
3580 two "reversed" instructions make subtraction and division "symmetrical" like
3581 addition and multiplication. The variety of instruction and operand forms
3582 give the programmer unusual flexibility:
3583
3584   -  Operands may be located in registers or memory.
3585
3586   -  Results may be deposited in a choice of registers.
3587
3588   -  Operands may be a variety of NPX data types: extended real, double
3589      real, single real, short integer or word integer, with automatic
3590      conversion to extended real performed by the 80387.
3591
3592 Five basic instruction forms may be used across all six operations, as
3593 shown in Table 4-3. The classical stack form may be used to make the 80387
3594 operate like a classical stack machine. No operands are coded in this form,
3595 only the instruction mnemonic. The NPX picks the source operand from the
3596 stack top and the destination from the next stack element. It then pops the
3597 stack, performs the operation, and returns the result to the new stack top,
3598 effectively replacing the operands by the result.
3599
3600 The register form is a generalization of the classical stack form; the
3601 programmer specifies the stack top as one operand and any register on the
3602 stack as the other operand. Coding the stack top as the destination provides
3603 a convenient way to access a constant, held elsewhere in the stack, from the
3604 stack top. The destination need not always be ST, however. All two-operand
3605 instructions allow use of another register as the destination. This coding
3606 (ST is the source operand) allows, for example, adding the stack top into a
3607 register used as an accumulator.
3608
3609 Often the operand in the stack top is needed for one operation but then is
3610 of no further use in the computation. The register pop form can be used to
3611 pick up the stack top as the source operand, and then discard it by popping
3612 the stack. Coding operands of ST(1), ST with a register pop mnemonic is
3613 equivalent to a classical stack operation: the top is popped and the result
3614 is left at the new top.
3615
3616 The two memory forms increase the flexibility of the 80387's
3617 nontranscendental instructions. They permit a real number or a binary
3618 integer in memory to be used directly as a source operand. This is useful in
3619 situations where operands are not used frequently enough to justify holding
3620 them in registers. Note that any memory addressing method may be used to
3621 define these operands, so they may be elements in arrays, structures, or
3622 other data organizations, as well as simple scalars.
3623
3624 The six basic operations are discussed further in the next paragraphs, and
3625 descriptions of the remaining eight operations follow.
3626
3627
3628 Table 4-2.  Nontranscendental Instructions
3629
3630 Addition
3631 FADD              Add real
3632 FADDP             Add real and pop
3633 FIADD             Integer add
3634
3635 Subtraction
3636 FSUB              Subtract real
3637 FSUBP             Subtract real and pop
3638 FISUB             Integer subtract
3639 FSUBR             Subtract real reversed
3640 FSUBRP            Subtract real reversed and pop
3641 FISUBR            Integer subtract reversed
3642
3643 Multiplication
3644 FMUL              Multiply real
3645 FMULP             Multiply real and pop
3646 FIMUL             Integer multiply
3647
3648 Division
3649 FDIV              Divide real
3650 FDIVP             Divide real and pop
3651 FIDIV             Integer divide
3652 FDIVR             Divide real reversed
3653 FDIVRP            Divide real reversed and pop
3654 FIDIVR            Integer divide reversed
3655
3656 Other Operations
3657 FSQRT             Square root
3658 FSCALE            Scale
3659 FPREM             Partial remainder
3660 FPREM1            IEEE standard partial remainder
3661 FRNDINT           Round to integer
3662 FXTRACT           Extract exponent and significand
3663 FABS              Absolute value
3664 FCHS              Change sign
3665
3666
3667 Table 4-3.  Basic Nontranscendental Instructions and Operands
3668
3669 Instruction Form           Mnemonic  Operand Forms
3670                            Form      destination, source    ASM386 Example
3671
3672 Classical stack            Fop       [ST(1), ST]            FADD
3673 Classical stack, extra pop FopP      [ST(1), ST]            FADDP
3674 Register                   Fop       ST(i), ST or ST, ST(i) FSUB   ST, ST(3)
3675 Register pop               FopP      ST(i), ST              FMULP  ST(2), ST
3676 Real memory                Fop       [ST,] single/double    FDIV   AZIMUTH
3677 Integer memory             FIop      [ST,] word-integer/    FIDIV  PULSES
3678                                            short-integer
3679
3680 ---------------------------------------------------------------------------
3681 NOTES
3682   Brackets ([]) surround implicit operands; these are not coded, and are
3683   shown here for information only.
3684
3685   op=              ADD    destination ← destination + source
3686                    SUB    destination ← destination - source
3687                    SUBR   destination ← source - destination
3688                    MUL    destination ← destination * source
3689                    DIV    destination ← destination ÷ source
3690                    DIVR   destination ← source ÷ destination
3691 ---------------------------------------------------------------------------
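
The six definitions in the notes to Table 4-3 translate directly into a
Python table of functions. The names OPERATIONS and apply_op are
hypothetical, ordinary Python floats stand in for extended reals, and
Python raises ZeroDivisionError where the NPX would apply its zero-divide
response.

```python
OPERATIONS = {
    "ADD": lambda destination, source: destination + source,
    "SUB": lambda destination, source: destination - source,
    "SUBR": lambda destination, source: source - destination,
    "MUL": lambda destination, source: destination * source,
    "DIV": lambda destination, source: destination / source,
    "DIVR": lambda destination, source: source / destination,
}


def apply_op(op, destination, source):
    """Return the value that replaces the destination operand."""
    return OPERATIONS[op](destination, source)
```

With a destination of 10.0 and a source of 4.0, SUB yields 6.0 and SUBR
yields -6.0, which shows how the reversed forms exchange the operand
roles without exchanging where the result is written.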
3692
3693
3694 4.4.1  Addition
3695
3696 FADD     //source/destination,source
3697 FADDP    //destination,source
3698 FIADD    source
3699
3700 The addition instructions (add real, add real and pop, integer add) add the
3701 source and destination operands and return the sum to the destination. The
3702 operand at the stack top may be doubled by coding:
3703
3704 FADD ST, ST(0)
3705
3706 If the source operand is in memory, conversion of an integer, a single
3707 real, or a double real operand to extended real is performed automatically.
3708
3709
3710 4.4.2  Normal Subtraction
3711
3712 FSUB     //source/destination,source
3713 FSUBP    //destination,source
3714 FISUB    source
3715
3716 The normal subtraction instructions (subtract real, subtract real and pop,
3717 integer subtract) subtract the source operand from the destination and
3718 return the difference to the destination.
3719
3720
3721 4.4.3  Reversed Subtraction
3722
3723 FSUBR    //source/destination,source
3724 FSUBRP   //destination,source
3725 FISUBR   source
3726
3727 The reversed subtraction instructions (subtract real reversed, subtract
3728 real reversed and pop, integer subtract reversed) subtract the destination
3729 from the source and return the difference to the destination. For example,
3730 FSUBR ST, ST(1) means subtract ST from ST(1) and leave the result in ST.
3731
3732
3733 4.4.4  Multiplication
3734
3735 FMUL     //source/destination,source
3736 FMULP    //destination,source
3737 FIMUL    source
3738
3739 The multiplication instructions (multiply real, multiply real and pop,
3740 integer multiply) multiply the source and destination operands and return
3741 the product to the destination. Coding FMUL ST, ST(0) squares the content of
3742 the stack top.
3743
3744
3745 4.4.5  Normal Division
3746
3747 FDIV     //source/destination,source
3748 FDIVP    //destination,source
3749 FIDIV    source
3750
3751 The normal division instructions (divide real, divide real and pop, integer
3752 divide) divide the destination by the source and return the quotient to the
3753 destination.
3754
3755
3756 4.4.6  Reversed Division
3757
3758 FDIVR    //source/destination,source
3759 FDIVRP   //destination,source
3760 FIDIVR   source
3761
3762 The reversed division instructions (divide real reversed, divide real
3763 reversed and pop, integer divide reversed) divide the source operand by the
3764 destination and return the quotient to the destination.
3765
3766
3767 4.4.7  FSQRT
3768
3769 FSQRT (square root) replaces the content of the top stack element with its
3770 square root. (Note: The square root of -0 is defined to be -0.)
3771
3772
3773 4.4.8  FSCALE
3774
3775 FSCALE (scale) interprets the value contained in ST(1) as an integer and
3776 adds this value to the exponent of the number in ST. This is equivalent to
3777
3778 ST ← ST * 2^(ST(1))
3779
3780 Thus, FSCALE provides rapid multiplication or division by integral powers
3781 of 2. It is particularly useful for scaling the elements of a vector.
3782
3783 There is no limit on the range of the scale factor in ST(1). If the value
3784 is not integral, FSCALE uses the nearest integer smaller in magnitude; i.e.,
3785 it chops the value toward 0. If the resulting integer is zero, the value in
3786 ST is not changed.
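
The scaling rule, including the chopping of a nonintegral scale factor,
can be sketched in Python with math.ldexp. The function name is
hypothetical, and math.ldexp raises OverflowError on overflow instead of
applying the responses described in section 3.2.5.

```python
import math


def fscale(st, st1):
    """Return ST * 2^n, where n is ST(1) chopped toward zero."""
    return math.ldexp(st, math.trunc(st1))
```

For example, fscale(3.0, 2.9) is 12.0 (scale factor chopped to 2),
fscale(3.0, -1.5) is 1.5 (chopped to -1), and fscale(3.0, 0.7) leaves 3.0
unchanged.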
3787
3788
3789 4.4.9  FPREM -- Partial Remainder (80287/8087-Compatible)
3790
3791 FPREM computes the remainder of division of ST by ST(1) and leaves the
3792 result in ST. FPREM finds a remainder REM and a quotient Q such that
3793
3794 REM = ST - ST(1)*Q
3795
3796 The quotient Q is chosen to be the integer obtained by chopping the exact
3797 value of ST/ST(1) toward zero. The sign of the remainder is the same as the
3798 sign of the original dividend from ST.
3799
3800 By ignoring precision control, the 80387 produces an exact result with
3801 FPREM. The precision (inexact) exception does not occur and the rounding
3802 control has no effect.
3803
3804 The FPREM instruction is not the remainder operation specified in the IEEE
3805 standard. To get that remainder, the FPREM1 instruction should be used.
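
The difference between the two remainders can be seen in Python, where
math.fmod chops the quotient toward zero (as FPREM does) and
math.remainder rounds it to the nearest integer (the IEEE definition that
FPREM1 provides). The function fprem_exact is a hypothetical helper that
evaluates the defining equation with exact rational arithmetic; it assumes
finite operands and a nonzero divisor.

```python
import math
from fractions import Fraction


def fprem_exact(st, st1):
    """REM = ST - ST(1)*Q, with Q the exact quotient chopped toward zero."""
    q = math.trunc(Fraction(st) / Fraction(st1))
    return float(Fraction(st) - Fraction(st1) * q)
```

For ST = 5.0 and ST(1) = 3.0, the chopped quotient is 1 and the remainder
is 2.0, while the IEEE remainder uses the quotient 2 and gives -1.0.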
3806
3807 The FPREM instruction is designed to be executed iteratively in a
3808 software-controlled loop. It operates by performing successive scaled
3809 subtractions; therefore, obtaining the exact remainder when the operands
3810 differ greatly in magnitude can consume large amounts of execution time.
3811 Because the 80387 can only be preempted between instructions, the remainder
3812 function could seriously increase interrupt latency in these cases. For
3813 this reason, the maximum number of iterations is limited. The instruction
may terminate before it has completed the calculation. The C2
3815 bit of the status word indicates whether the calculation is complete or
3816 whether the instruction must be executed again.
3817
3818 FPREM can reduce the exponent of ST by up to (but not including) 64 in one
3819 execution. If FPREM produces a remainder that is less than the modulus
3820 (i.e., the divisor), the function is complete and bit C2 of the status word
3821 condition code is cleared. If the function is incomplete, C2 is set to 1;
3822 the result in ST is then called the partial remainder. Software can inspect
3823 C2 by storing the status word following execution of FPREM, reexecuting the
3824 instruction (using the partial remainder in ST as the dividend) until C2 is
3825 cleared. A higher priority interrupting routine that needs the 80387 can
3826 force a context switch between the instructions in the remainder loop.
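
The software-controlled loop can be illustrated with a Python model. This is
a hedged sketch, not a description of the 80387's internal algorithm: the
names fprem_pass and fprem_loop are hypothetical, the model always reduces
against the modulus scaled by the largest permitted power of two, and it
handles finite, nonzero double-precision operands only.

```python
import math

def fprem_pass(st, st1, max_reduction=63):
    """One simulated pass: return (new_st, c2); c2 = 1 means incomplete."""
    # Difference between the binary exponents of dividend and modulus.
    d = math.frexp(st)[1] - math.frexp(st1)[1]
    if d <= max_reduction:
        return math.fmod(st, st1), 0      # remainder is final, C2 = 0
    # Operands too far apart: reduce against a scaled modulus, C2 = 1.
    return math.fmod(st, math.ldexp(st1, d - max_reduction)), 1

def fprem_loop(st, st1):
    """Re-execute the pass until C2 is clear; return (remainder, passes)."""
    passes = 0
    c2 = 1
    while c2:
        st, c2 = fprem_pass(st, st1)
        passes += 1
    return st, passes

rem, passes = fprem_loop(1.0e300, 3.0)
print(rem == math.fmod(1.0e300, 3.0), passes > 1)   # True True
```

Because every pass is exact, the partial remainders introduce no rounding
error; the loop only trades one long operation for several short ones.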
3827
An important use for FPREM is to reduce arguments (operands) of
transcendental functions to the range permitted by these instructions. For
example, the FPTAN (tangent) instruction requires its argument ST to be less
than 2^(63). For π/4 < |ST| < 2^(63), FPTAN (as well as the other
trigonometric instructions) performs an internal reduction of ST to a value
less than π/4 using an internally stored π/4 divisor that has 67 significant
bits. Because of its greater accuracy, this method of reduction is
recommended when the argument is within the required range.

However, when |ST| ≥ 2^(63), FPREM can be employed to reduce ST. With π/4 as
a modulus, FPREM can reduce an argument so that it is within range of FPTAN
and so that no further reduction is required by FPTAN.
3840
Because FPREM produces an exact result, the argument reduction does not
introduce roundoff error into the calculation, even if several iterations
are required to bring the argument into range. However, the stored value of
π is never exact. The rounding of π, when it is used by FPREM to reduce an
argument for a periodic trigonometric function, does not create the effect
of a rounded argument, but of a rounded period.
3847
3848 When reduction is complete, FPREM provides the least-significant three bits
3849 of the quotient generated by FPREM (in C{3}, C{1}, C{0}). This is also
3850 important for transcendental argument reduction, because it locates the
original angle in the correct one of eight π/4 segments of the unit circle
3852 (see Table 4-4).
3853
3854
Table 4-4.  Condition Code Interpretation after FPREM and FPREM1
            Instructions

    Condition Code
C2(PF)  C3    C1    C0              Interpretation after FPREM and FPREM1
---------------------------------------------------------------------------
  1     X     X     X               Incomplete reduction: further iteration
                                    required for complete reduction

        Q1    Q0    Q2    Q MOD 8

  0     0     0     0        0      Complete reduction: C0, C3, C1 contain
  0     0     1     0        1      the three least-significant bits of
  0     1     0     0        2      the quotient
  0     1     1     0        3
  0     0     0     1        4
  0     0     1     1        5
  0     1     0     1        6
  0     1     1     1        7
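
The mapping in Table 4-4 can be checked with a short Python model. This is a
sketch under stated assumptions: the function name is hypothetical, the
quotient is computed exactly with rational arithmetic rather than by the
80387, and taking the magnitude of a negative quotient is a choice made for
this sketch.

```python
import math
from fractions import Fraction

def fprem_condition_bits(st, st1):
    """Return (C3, C1, C0) = (Q1, Q0, Q2) for the chopped quotient st/st1."""
    q = abs(int(Fraction(st) / Fraction(st1)))   # exact; int() chops toward 0
    q2, q1, q0 = (q >> 2) & 1, (q >> 1) & 1, q & 1
    return q1, q0, q2

print(fprem_condition_bits(5.0, 1.0))   # (0, 1, 1), the row for Q MOD 8 = 5

# Octant of a 2.0 radian angle when the modulus is pi/4.
c3, c1, c0 = fprem_condition_bits(2.0, math.pi / 4)
print(4 * c0 + 2 * c3 + c1)             # 2
```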
3874
3875
4.4.10  FPREM1--Partial Remainder (IEEE Std. 754-Compatible)
3877
3878 FPREM1 computes the remainder of division of ST by ST(1) and leaves the
3879 result in ST. FPREM1 finds a remainder REM1 and a quotient Q1 such that
3880
3881 REM1 = ST - ST(1)*Q1
3882
3883 The quotient Q1 is chosen to be the integer nearest to the exact value of
3884 ST/ST(1). When ST/ST(1) is exactly N + 1/2 (for some integer N), there are
3885 two integers equally close to ST/ST(1). In this case the value chosen for Q1
3886 is the even integer.
3887
3888 The result produced by FPREM1 is always exact; no rounding is necessary,
3889 and therefore the precision exception does not occur and the rounding
3890 control has no effect.
3891
3892 The FPREM1 instruction is designed to be executed iteratively in a
3893 software-controlled loop. FPREM1 operates by performing successive scaled
3894 subtractions; therefore, obtaining the exact remainder when the operands
3895 differ greatly in magnitude can consume large amounts of execution time.
3896 Because the 80387 can only be preempted between instructions, the remainder
3897 function could seriously increase interrupt latency in these cases. For
3898 this reason, the maximum number of iterations is limited. The instruction
may terminate before it has completed the calculation. The C2
3900 bit of the status word indicates whether the calculation is complete or
3901 whether the instruction must be executed again.
3902
3903 FPREM1 can reduce the exponent of ST by up to (but not including) 64 in one
3904 execution. If FPREM1 produces a remainder that is less than the modulus
3905 (i.e., the divisor), the function is complete and bit C2 of the status word
3906 condition code is cleared. If the function is incomplete, C2 is set to 1;
3907 the result in ST is then called the partial remainder. Software can inspect
3908 C2 by storing the status word following execution of FPREM1, reexecuting
3909 the instruction (using the partial remainder in ST as the dividend) until C2
3910 is cleared. When C2 is cleared, FPREM1 also provides the least-significant
3911 three bits of the quotient generated by FPREM1 (in C{3}, C{1}, C{0}).
3912
3913 The uses for FPREM1 are the same as those for FPREM.
3914
FPREM1 differs from FPREM in these respects:

  -  FPREM and FPREM1 choose the value of the quotient differently; the
     low-order three bits of the quotient as reported in bits C3, C1, C0 of
     the status word may differ by one in some cases.

  -  FPREM and FPREM1 may produce different remainders. FPREM produces a
     remainder R such that 0 ≤ R < |ST(1)| or -|ST(1)| < R ≤ 0, depending
     on the sign of the dividend. FPREM1 produces a remainder R1 such that
     -|ST(1)|/2 ≤ R1 ≤ +|ST(1)|/2.
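
The two remainder definitions can be compared using Python's standard
library, whose math.fmod chops the quotient toward zero and whose
math.remainder rounds it to nearest (ties to even). Treating these as
stand-ins for completed FPREM and FPREM1 operations is an assumption of this
sketch; the wrapper names are hypothetical.

```python
import math

def fprem(st, st1):
    """Chopped-quotient remainder; the sign follows the dividend."""
    return math.fmod(st, st1)

def fprem1(st, st1):
    """IEEE remainder; the quotient is rounded to nearest, ties to even."""
    return math.remainder(st, st1)

for st, st1 in [(5.0, 3.0), (3.0, 2.0), (-5.0, 3.0)]:
    print(st, st1, fprem(st, st1), fprem1(st, st1))
# 5.0 3.0 2.0 -1.0
# 3.0 2.0 1.0 -1.0    (3/2 is a tie; the even quotient 2 is chosen)
# -5.0 3.0 -2.0 1.0
```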
3925
3926
3927 4.4.11  FRNDINT
3928
3929 FRNDINT (round to integer) rounds the top stack element to an integer
3930 according to the RC bits of the control word. For example, assume that ST
3931 contains the 80387 real number encoding of the decimal value 155.625.
3932 FRNDINT will change the value to 155 if the RC field of the control word is
3933 set to down or chop, or to 156 if it is set to up or nearest.
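
The four rounding modes can be mimicked in Python to reproduce the 155.625
example. The function name and the mode names used as dictionary keys are
hypothetical labels for this sketch, not RC field encodings.

```python
import math

ROUNDERS = {
    "nearest": round,        # ties go to the even integer
    "down": math.floor,      # toward minus infinity
    "up": math.ceil,         # toward plus infinity
    "chop": math.trunc,      # toward zero
}

def frndint(st, rc="nearest"):
    """Model of FRNDINT: round st to an integer in the given mode."""
    return float(ROUNDERS[rc](st))

for rc in ("down", "chop", "up", "nearest"):
    print(rc, frndint(155.625, rc))
# down 155.0
# chop 155.0
# up 156.0
# nearest 156.0
```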
3934
3935
3936 4.4.12  FXTRACT
3937
3938 FXTRACT (extract exponent and significand) performs a superset of the
3939 IEEE-recommended logb(x) function by "decomposing" the number in the stack
3940 top into two numbers that represent the actual value of the operand's
3941 exponent and significand fields. The "exponent" replaces the original
3942 operand on the stack and the "significand" is pushed onto the stack. (ST(7)
3943 must be empty to avoid causing the invalid-operation exception.) Following
3944 execution of FXTRACT, ST (the new stack top) contains the value of the
3945 original significand expressed as a real number: its sign is the same as the
3946 operand's, its exponent is 0 true (16,383 or 3FFFH biased), and its
3947 significand is identical to the original operand's. ST(1) contains the value
3948 of the original operand's true (unbiased) exponent expressed as a real
3949 number.
3950
If the original operand is zero, FXTRACT leaves -∞ in ST(1) (the exponent)
3952 while ST is assigned the value zero with a sign equal to that of the
3953 original operand. The zero-divide exception is raised in this case, as well.
3954
3955 To illustrate the operation of FXTRACT, assume that ST contains a number
3956 whose true exponent is +4 (i.e., its exponent field contains 4003H). After
3957 executing FXTRACT, ST(1) will contain the real number +4.0; its sign will be
3958 positive, its exponent field will contain 4001H (+2 true) and its
significand field will contain 1.00...00B. In other words, the value in
3960 ST(1) will be 1.0 * 2² = 4. If ST contains an operand whose true exponent
3961 is -7 (i.e., its exponent field contains 3FF8H), then FXTRACT will return an
3962 "exponent" of -7.0; after the instruction executes, ST(1)'s sign and
3963 exponent fields will contain C001H (negative sign, true exponent of 2), and
its significand will be 1.1100...00B. In other words, the value in ST(1)
3965 will be -1.75 * 2² = -7.0. In both cases, following FXTRACT, ST's sign and
3966 significand fields will be the same as the original operand's, and its
3967 exponent field will contain 3FFFH (0 true).
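
The decomposition can be modeled with Python's math.frexp, which returns a
fraction in [0.5, 1) and is renormalized here to the [1, 2) significand that
FXTRACT produces. This is a sketch for finite double-precision values only;
the function name is hypothetical and the zero-divide exception for a zero
operand is not modeled.

```python
import math

def fxtract(st):
    """Return (significand, exponent), i.e. the new (ST, ST(1))."""
    if st == 0.0:
        return st, -math.inf       # zero keeps its sign; exponent is -inf
    m, e = math.frexp(st)          # st == m * 2**e with 0.5 <= |m| < 1
    return m * 2.0, float(e - 1)   # renormalize to 1 <= |significand| < 2

print(fxtract(20.0))          # (1.25, 4.0)   true exponent +4
print(fxtract(1.5 * 2**-7))   # (1.5, -7.0)   true exponent -7
print(fxtract(-7.0))          # (-1.75, 2.0)  -7.0 is -1.75 * 2**2
```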
3968
3969 FXTRACT is useful for power and range scaling operations. Both FXTRACT and
3970 the base 2 exponential instruction F2XM1 are needed to perform a general
3971 power operation. Converting numbers in 80387 extended real format to decimal
3972 representations (e.g., for printing or displaying) requires not only FBSTP
3973 but also FXTRACT to allow scaling that does not overflow the range of the
3974 extended format. FXTRACT can also be useful for debugging, because it allows
3975 the exponent and significand parts of a real number to be examined
3976 separately.
3977
3978
3979 4.4.13  FABS
3980
3981 FABS (absolute value) changes the top stack element to its absolute value
3982 by making its sign positive. Note that the invalid-operation exception is
3983 not signaled even if the operand is a signaling NaN or has a format that is
3984 not supported.
3985
3986
3987 4.4.14  FCHS
3988
3989 FCHS (change sign) complements (reverses) the sign of the top stack
3990 element. Note that the invalid-operation exception is not signaled even if
3991 the operand is a signaling NaN or has a format that is not supported.
3992
3993
3994 4.5  Comparison Instructions
3995
3996 The instructions of this class allow comparison of numbers of all supported
3997 real and integer data types. Each of these instructions (Table 4-5)
3998 analyzes the top stack element, often in relationship to another operand,
3999 and reports the result as a condition code in the status word.
4000
4001 The basic operations are compare, test (compare with zero), and examine
4002 (report type, sign, and normalization). Special forms of the compare
4003 operation are provided to optimize algorithms by allowing direct comparisons
4004 with binary integers and real numbers in memory, as well as popping the
4005 stack after a comparison.
4006
4007 The FSTSW (store status word) instruction may be used following a
4008 comparison to transfer the condition code to memory or to the 80386 AX
4009 register for inspection. The 80386 SAHF instruction is recommended for
4010 copying the 80387 flags from AX to the 80386 flags for easy conditional
4011 branching.
4012
4013 Note that instructions other than those in the comparison group may update
4014 the condition code. To ensure that the status word is not altered
4015 inadvertently, store it immediately following a comparison operation.
4016
4017
4018 Table 4-5.  Comparison Instructions
4019
4020 FCOM           Compare real
4021 FCOMP          Compare real and pop
4022 FCOMPP         Compare real and pop twice
4023 FICOM          Integer compare
4024 FICOMP         Integer compare and pop
4025 FTST           Test
4026 FUCOM          Unordered compare real
4027 FUCOMP         Unordered compare real and pop
4028 FUCOMPP        Unordered compare real and pop twice
4029 FXAM           Examine
4030
4031
4032 4.5.1  FCOM //source
4033
4034 FCOM (compare real) compares the stack top to the source operand. The
4035 source operand may be a register on the stack, or a single or double real
4036 memory operand. If an operand is not coded, ST is compared to ST(1). The
4037 sign of zero is ignored, so that +0 = -0. Following the instruction, the
4038 condition codes reflect the order of the operands as shown in Table 4-6.
4039
4040 If either operand is a NaN (either quiet or signaling) or an undefined
4041 format, or if a stack fault occurs, the invalid-operation exception is
4042 raised and the condition bits are set to "unordered."
4043
4044
4045 Table 4-6.  Condition Code Resulting from Comparisons
4046
4047                                                 80386
4048 Order           C3 (ZF)   C2 (PF)   C0 (CF)     Conditional
4049                                                 Branch
4050
4051 ST > Operand    0         0         0           JA
4052 ST < Operand    0         0         1           JB
4053 ST = Operand    1         0         0           JE
4054 Unordered       1         1         1           JP
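
Table 4-6 can be expressed as a small Python model. The function name fcom
and the BRANCH dictionary are hypothetical; the sketch reports only the
condition bits and does not model the invalid-operation exception or stack
faults.

```python
import math

BRANCH = {(0, 0, 0): "JA", (0, 0, 1): "JB", (1, 0, 0): "JE", (1, 1, 1): "JP"}

def fcom(st, operand):
    """Return (C3, C2, C0) for comparing st with operand, as in Table 4-6."""
    if math.isnan(st) or math.isnan(operand):
        return (1, 1, 1)        # unordered
    if st > operand:
        return (0, 0, 0)
    if st < operand:
        return (0, 0, 1)
    return (1, 0, 0)            # equal; +0.0 and -0.0 compare equal

print(BRANCH[fcom(2.0, 1.0)])           # JA
print(BRANCH[fcom(0.0, -0.0)])          # JE
print(BRANCH[fcom(float("nan"), 1.0)])  # JP
```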
4055
4056
4057 4.5.2  FCOMP //source
4058
4059 FCOMP (compare real and pop) operates like FCOM, and in addition pops the
4060 stack.
4061
4062
4063 4.5.3  FCOMPP
4064
4065 FCOMPP (compare real and pop twice) operates like FCOM and additionally
4066 pops the stack twice, discarding both operands. FCOMPP always compares ST to
4067 ST(1); no operands may be explicitly specified.
4068
4069
4070 4.5.4  FICOM source
4071
4072 FICOM (integer compare) converts the source operand, which may reference a
4073 word or short binary integer variable, to extended real and compares the
4074 stack top to it. The condition code bits in the status word are set as for
4075 FCOM.
4076
4077
4078 4.5.5  FICOMP source
4079
4080 FICOMP (integer compare and pop) operates identically to FICOM and
4081 additionally discards the value in ST by popping the NPX stack.
4082
4083
4084 4.5.6  FTST
4085
4086 FTST (test) tests the top stack element by comparing it to zero. The result
4087 is posted to the condition codes as shown in Table 4-7.
4088
4089
4090 Table 4-7.  Condition Code Resulting from FTST
4091
                                                80386
Order           C3 (ZF)   C2 (PF)   C0 (CF)     Conditional
                                                Branch

ST > 0.0        0         0         0           JA
ST < 0.0        0         0         1           JB
ST = 0.0        1         0         0           JE
Unordered       1         1         1           JP
4100
4101
4102 4.5.7  FUCOM //source
4103
4104 FUCOM (unordered compare real) operates like FCOM, with two differences:
4105
4106   1.  It does not cause an invalid-operation exception when one of the
4107       operands is a NaN. If either operand is a NaN, the condition bits of
4108       the status word are set to unordered as shown in Table 4-6.
4109
4110   2.  Only operands on the NPX stack can be compared.
4111
4112
4113 4.5.8  FUCOMP //source
4114
4115 FUCOMP (unordered compare real and pop) operates like FUCOM and in addition
4116 pops the NPX stack.
4117
4118
4119 4.5.9  FUCOMPP
4120
FUCOMPP (unordered compare real and pop twice) operates like FUCOM and in
addition pops the NPX stack twice, discarding both operands. FUCOMPP always
compares ST to ST(1); no operands can be explicitly specified.
4124
4125
4126 4.5.10  FXAM
4127
4128 FXAM (examine) reports the content of the top stack element as
4129 positive/negative and NaN, denormal, normal, zero, infinity, unsupported, or
4130 empty. Table 4-8 lists and interprets all the condition code values that
4131 FXAM generates.
4132
4133
4134 4.6  Transcendental Instructions
4135
4136 The instructions in this group (Table 4-9) perform the time-consuming core
4137 calculations for all common trigonometric, inverse trigonometric,
4138 hyperbolic, inverse hyperbolic, logarithmic, and exponential functions. The
4139 transcendentals operate on the top one or two stack elements, and they
4140 return their results to the stack. The trigonometric operations assume their
4141 arguments are expressed in radians. The logarithmic and exponential
4142 operations work in base 2.
4143
4144 The results of transcendental instructions are highly accurate. The
4145 absolute value of the relative error of the transcendental instructions is
4146 guaranteed to be less than 2^(-62). (Relative error is the ratio between the
4147 absolute error and the exact value.)
4148
4149 The trigonometric functions accept a practically unrestricted range of
4150 operands, whereas the other transcendental instructions require that
4151 arguments be more restricted in range. FPREM or FPREM1 may be used to bring
4152 the otherwise valid operand of a periodic function into range. Prologue and
4153 epilogue software may be used to reduce arguments for other instructions to
4154 the expected range and to adjust the result to correspond to the original
4155 arguments if necessary. The instruction descriptions in this section
4156 document the allowed operand range for each instruction.
4157
4158
4159 Table 4-8.  Condition Code Defining Operand Class
4160
4161 C3  C2  C1  C0    Value at TOP
4162
4163 0   0   0   0    +Unsupported
4164 0   0   0   1    +NaN
4165 0   0   1   0    -Unsupported
4166 0   0   1   1    -NaN
4167 0   1   0   0    +Normal
4168 0   1   0   1    +Infinity
4169 0   1   1   0    -Normal
4170 0   1   1   1    -Infinity
4171 1   0   0   0    +0
4172 1   0   0   1    +Empty
4173 1   0   1   0    -0
4174 1   0   1   1    -Empty
4175 1   1   0   0    +Denormal
4176 1   1   1   0    -Denormal
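
The classes in Table 4-8 can be illustrated by classifying a Python float.
This is an analogy only: the function name is hypothetical, the
classification is made in double precision (a double denormal is not
necessarily a denormal in extended real format), and the Unsupported and
Empty classes have no counterpart for a Python float.

```python
import math
import sys

def fxam(x):
    """Return (C3, C2, C1, C0) for a Python float, following Table 4-8."""
    c1 = 1 if math.copysign(1.0, x) < 0 else 0   # C1 carries the sign
    if math.isnan(x):
        c3, c2, c0 = 0, 0, 1
    elif math.isinf(x):
        c3, c2, c0 = 0, 1, 1
    elif x == 0.0:
        c3, c2, c0 = 1, 0, 0
    elif abs(x) < sys.float_info.min:            # below smallest normal
        c3, c2, c0 = 1, 1, 0
    else:
        c3, c2, c0 = 0, 1, 0
    return c3, c2, c1, c0

print(fxam(1.0))         # (0, 1, 0, 0)  +Normal
print(fxam(-math.inf))   # (0, 1, 1, 1)  -Infinity
print(fxam(-0.0))        # (1, 0, 1, 0)  -0
print(fxam(5e-324))      # (1, 1, 0, 0)  +Denormal
```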
4177
4178
4179 Table 4-9.  Transcendental Instructions
4180
FSIN         Sine
FCOS         Cosine
FSINCOS      Sine and cosine
FPTAN        Tangent of ST
FPATAN       Arctangent of ST(1)/ST
F2XM1        2^(X) - 1
FYL2X        Y * log{2}X;        Y is ST(1), X is ST
FYL2XP1      Y * log{2}(X + 1);  Y is ST(1), X is ST
4189
4190
4191 4.6.1  FCOS
4192
4193 When complete, this function replaces the contents of ST with COS(ST). ST,
expressed in radians, must lie in the range |θ| < 2^(63) (for most practical
4195 purposes unrestricted). If ST is in range, C2 of the status word is cleared
4196 and the result of the operation is produced.
4197
4198 If the operand is outside of the range, C2 is set to one (function
4199 incomplete) and ST remains intact (i.e., no reduction of the operand is
performed). It is the programmer's responsibility to reduce the operand to an
4201 absolute value smaller than 2^(63). The instructions FPREM1 and FPREM are
4202 available for this purpose.
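
The range check and the software reduction can be sketched in Python. The
names fcos and cos_any are hypothetical, and math.remainder stands in for a
completed FPREM1 loop. Because 2π is itself rounded, the reduction has the
effect of a rounded period, so results for very large arguments are only
indicative.

```python
import math

LIMIT = 2.0 ** 63

def fcos(st):
    """Return (new_st, c2); c2 = 1 means the operand was out of range."""
    if abs(st) < LIMIT:
        return math.cos(st), 0
    return st, 1                 # operand is left intact

def cos_any(st):
    """Cosine with software reduction when fcos reports C2 = 1."""
    result, c2 = fcos(st)
    if c2:
        # Reduce by the period first, then retry the instruction.
        result, c2 = fcos(math.remainder(st, 2.0 * math.pi))
    return result

print(fcos(0.0))         # (1.0, 0)
print(fcos(2.0 ** 64))   # (1.8446744073709552e+19, 1)
```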
4203
4204
4205 4.6.2  FSIN
4206
4207 When complete, this function replaces the contents of ST with SIN(ST). FSIN
4208 is equivalent to FCOS in the way it reduces the operand. ST is expressed in
4209 radians.
4210
4211
4212 4.6.3  FSINCOS
4213
4214 When complete, this instruction replaces the contents of ST with SIN(ST),
4215 then pushes COS(ST) onto the stack. (ST(7) must be empty to avoid an invalid
4216 exception.) FSINCOS is equivalent to FCOS in the way it reduces the operand.
4217 ST is expressed in radians.
4218
4219
4220 4.6.4  FPTAN
4221
4222 When complete, FPTAN (partial tangent) computes the function Y = TAN (ST).
4223 ST is expressed in radians. Y replaces ST, then the value 1 is pushed,
4224 becoming the new stack top. (ST(7) must be empty to avoid an invalid
4225 exception.) When the function is complete ST(1) = TAN (arg) and ST = 1.
4226 FPTAN is equivalent to FCOS in the way it reduces the operand.
4227
4228 The fact that FPTAN places two results on the stack maintains compatibility
4229 with the 8087/80287 and aids the calculation of other trigonometric
4230 functions that can be derived from tan via standard trigonometric
4231 identities. For example, the cot function is given by this identity:
4232
4233 cot x = 1/tan x.
4234
4235 Therefore, simply executing the reverse divide instruction FDIVR after
4236 FPTAN yields the cot function.
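
The cotangent sequence can be traced with a list standing in for the NPX
stack (the last list element is ST). This is a sketch: the function names
are hypothetical, and the reversed division is assumed to be the popping
form that divides ST by ST(1) and leaves the quotient at the new stack top.

```python
import math

def fptan(stack):
    """Replace ST with TAN(ST), then push 1.0."""
    stack.append(math.tan(stack.pop()))
    stack.append(1.0)

def fdivrp(stack):
    """Reversed divide and pop: ST(1) <- ST / ST(1), then pop."""
    st = stack.pop()          # source: 1.0 after FPTAN
    st1 = stack.pop()         # destination: TAN(arg)
    stack.append(st / st1)

def cot(x):
    stack = [x]
    fptan(stack)              # stack is now [tan(x), 1.0]
    fdivrp(stack)             # stack is now [1.0 / tan(x)]
    return stack[-1]

print(math.isclose(cot(0.5), 1.0 / math.tan(0.5)))   # True
```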
4237
4238
4239 4.6.5  FPATAN
4240
FPATAN (arctangent) computes the function θ = ARCTAN (Y/X). X is taken from
ST(0) and Y from ST(1). The instruction pops the NPX stack and returns θ to
4243 the (new) stack top, overwriting the Y operand. The result is expressed in
4244 radians. The range of operands is not restricted; however, the range of the
4245 result depends on the relationship between the operands according to Table
4246 4-10.
4247
4248 The fact that the argument of FPATAN is a ratio aids calculation of other
4249 trigonometric functions, including Arcsin and Arccos. These can be derived
4250 from Arctan via standard trigonometric identities. For example, the Arcsin
4251 function can be easily calculated using this identity:
4252
Arcsin x = Arctan (x / √(1 - x²)).

Thus, to find Arcsin (Y), push Y onto the NPX stack, then calculate
X = √(1 - Y²), pushing the result X onto the stack. Executing FPATAN then
4257 leaves Arcsin (Y) at the top of the stack.
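
The identity can be checked numerically in Python, with math.atan2 standing
in for FPATAN (both take Y and X separately rather than their ratio). The
function names are hypothetical.

```python
import math

def fpatan(y, x):
    """Model of FPATAN: arctangent of Y/X, with Y from ST(1), X from ST."""
    return math.atan2(y, x)

def arcsin(y):
    x = math.sqrt(1.0 - y * y)   # X = sqrt(1 - Y**2)
    return fpatan(y, x)

print(math.isclose(arcsin(0.5), math.asin(0.5)))   # True
```

Passing Y and X separately is what lets the case Y = 1 work: X is then 0,
and no division by zero is ever performed in software.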
4258
4259
4260 4.6.6  F2XM1
4261
4262 F2XM1 (2 to the X minus 1) calculates the function Y = 2^(X) - 1. X is taken
from the stack top and must be in the range -1 ≤ X ≤ 1. The result Y
4264 replaces the argument X at the stack top. If the argument is out of range,
4265 the results are undefined.
4266
4267 This instruction is designed to produce a very accurate result even when X
4268 is close to 0. For values of the argument very close in magnitude to 1, a
4269 larger error will be incurred. To obtain Y = 2^(X), add 1 to the result
4270 delivered by F2XM1.
4271
4272 The following formulas show how values other than 2 may be raised to a
4273 power of X:
4274
4275 10^(X) = 2^(X * LOG2(10))
4276
4277 e^(X) = 2^(X * LOG2(e))
4278
Y^(X) = 2^(X * LOG2(Y))
4280
4281 As shown in the next section, the 80387 has built-in instructions for
4282 loading the constants LOG{2}10 and LOG{2}e, and the FYL2X instruction may be
4283 used to calculate X*LOG{2}Y.
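
The formulas can be combined into a general power routine. The Python sketch
below is illustrative only: the function names are hypothetical, F2XM1 is
modeled with math.expm1, and the split of the exponent into an integer part
(applied by scaling, as FSCALE would) and a fraction within the F2XM1 range
is one possible approach, not a prescribed one.

```python
import math

def f2xm1(x):
    """Model of F2XM1: 2**x - 1 for -1 <= x <= 1."""
    assert -1.0 <= x <= 1.0
    return math.expm1(x * math.log(2.0))

def pow2(t):
    """2**t for any t: split t so the fraction fits the F2XM1 range."""
    n = round(t)                           # integer part
    f = t - n                              # fraction, |f| <= 0.5
    return math.ldexp(f2xm1(f) + 1.0, n)   # (2**f) * 2**n

def power(y, x):
    """y**x = 2**(x * log2(y)) for y > 0."""
    return pow2(x * math.log2(y))

print(math.isclose(power(10.0, 3.0), 1000.0, rel_tol=1e-12))   # True
```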
4284
4285
4286 Table 4-10.  Results of FPATAN
4287
Sign(Y)    Sign(X)     |Y| < |X|?      Final Result

+          +           Yes                   0 < atan(Y/X) < π/4
+          +           No                  π/4 < atan(Y/X) < π/2
+          -           No                  π/2 < atan(Y/X) < 3 * π/4
+          -           Yes             3 * π/4 < atan(Y/X) < π
-          +           Yes                -π/4 < atan(Y/X) < 0
-          +           No                 -π/2 < atan(Y/X) < -π/4
-          -           No             -3 * π/4 < atan(Y/X) < -π/2
-          -           Yes                  -π < atan(Y/X) < -3 * π/4
4298
4299
4300 4.6.7  FYL2X
4301
4302 FYL2X (Y log base 2 of X) calculates the function Z = Y * LOG{2}X. X is
4303 taken from the stack top and Y from ST(1). The operands must be in the
4304 following ranges:
4305
0 ≤ X < +∞
-∞ < Y < +∞

The instruction pops the NPX stack and returns Z at the (new) stack top,
replacing the Y operand. If the operand is out of range (i.e., is negative)
the invalid-operation exception occurs.
4312
4313 This function optimizes the calculations of log to any base other than two,
4314 because a multiplication is always required:
4315
LOG{N}x = (LOG{2}N)^(-1) * LOG{2}x
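
The change-of-base formula maps directly onto a single FYL2X operation with
Y set to the reciprocal of LOG{2}N. A Python sketch, with hypothetical
function names:

```python
import math

def fyl2x(y, x):
    """Model of FYL2X: Y * log2(X)."""
    return y * math.log2(x)

def log_base(n, x):
    """log of x to base n, using one multiplication by 1 / log2(n)."""
    return fyl2x(1.0 / math.log2(n), x)

print(math.isclose(log_base(10.0, 1000.0), 3.0))            # True
print(math.isclose(log_base(math.e, 8.0), math.log(8.0)))   # True
```

For bases 10 and e, the factor 1 / LOG{2}N equals LOG{10}2 and LOG{e}2
respectively.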
4317
4318
4319 4.6.8  FYL2XP1
4320
FYL2XP1 (Y log base 2 of (X + 1)) calculates the function
Z = Y * LOG{2}(X + 1). X is taken from the stack top and must be in the range
-(1 - SQRT(2)/2) < X < 1 - SQRT(2)/2. Y is taken from ST(1) and is unlimited
in range (-∞ < Y < +∞). FYL2XP1 pops the stack and returns Z at the (new)
stack top, replacing Y. If the argument is out of range, the results are
undefined.
4326
4327 This instruction provides improved accuracy over FYL2X when computing the
logarithm of a number very close to 1, for example 1 + ε where ε << 1.
Providing ε rather than 1 + ε as the input to the function allows more
4330 significant digits to be retained.
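
The loss of digits can be demonstrated in Python, with math.log1p standing
in for the FYL2XP1 core. The function names are hypothetical, and the
example uses double precision, where 1 + 1e-17 already rounds to exactly 1;
in extended real format the same effect appears for smaller values.

```python
import math

def fyl2x(y, x):
    """Model of FYL2X: Y * log2(X)."""
    return y * math.log2(x)

def fyl2xp1(y, x):
    """Model of FYL2XP1: Y * log2(X + 1), computed without forming X + 1."""
    return y * math.log1p(x) / math.log(2.0)

eps = 1e-17
print(fyl2x(1.0, 1.0 + eps))     # 0.0, because 1 + eps rounded to 1
print(fyl2xp1(1.0, eps) > 0.0)   # True, eps was retained
```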
4331
4332
4333 Table 4-11.  Constant Instructions
4334
FLDZ     Load + 0.0
FLD1     Load + 1.0
FLDPI    Load π
FLDL2T   Load log{2}10
FLDL2E   Load log{2}e
FLDLG2   Load log{10}2
FLDLN2   Load log{e}2
4342
4343
4344 4.7  Constant Instructions
4345
4346 Each of these instructions (Table 4-11) loads (pushes) a commonly used
4347 constant onto the stack. (ST(7) must be empty to avoid an invalid
4348 exception.) The values have full extended real precision (64 bits) and are
4349 accurate to approximately 19 decimal digits. Because an external real
4350 constant occupies 10 memory bytes, the constant instructions, which are
4351 only two bytes long, save storage and improve execution speed, in addition
4352 to simplifying programming.
4353
4354 The constants used by these instructions are stored internally in a format
4355 more precise even than extended real. When loading the constant, the 80387
rounds the more precise internal constant according to the RC (rounding
control) bits of the control word. However, in spite of this rounding, the
4358 precision exception is not raised (to maintain compatibility). When the
4359 rounding control is set to round to nearest on the 80387, the 80387
4360 produces the same constant that is produced by the 80287.
4361
4362
4363 4.7.1  FLDZ
4364
4365 FLDZ (load zero) loads (pushes) +0.0 onto the NPX stack.
4366
4367
4368 4.7.2  FLD1
4369
4370 FLD1 (load one) loads (pushes) +1.0 onto the NPX stack.
4371
4372
4373 4.7.3  FLDPI
4374
FLDPI (load π) loads (pushes) π onto the NPX stack.
4376
4377
4378 4.7.4  FLDL2T
4379
4380 FLDL2T (load log base 2 of 10) loads (pushes) the value LOG{2}10 onto the
4381 NPX stack.
4382
4383
4384 4.7.5  FLDL2E
4385
4386 FLDL2E (load log base 2 of e) loads (pushes) the value LOG{2}e onto the NPX
4387 stack.
4388
4389
4390 4.7.6  FLDLG2
4391
4392 FLDLG2 (load log base 10 of 2) loads (pushes) the value LOG{10}2 onto the
4393 NPX stack.
4394
4395
4396 4.7.7  FLDLN2
4397
4398 FLDLN2 (load log base e of 2) loads (pushes) the value LOG{e}2 onto the NPX
4399 stack.
4400
4401
4402 4.8  Processor Control Instructions
4403
4404 The processor control instructions are shown in Table 4-12. The instruction
4405 FSTSW is commonly used for conditional branching. The remaining instructions
4406 are not typically used in calculations; they provide control over the 80387
4407 NPX for system-level activities. These activities include initialization,
4408 exception handling, and task switching.
4409
4410 As shown in Table 4-12, many of the NPX processor control instructions have
4411 two forms of assembler mnemonic:
4412
4413   1.  A wait form, where the mnemonic is prefixed only with an F, such as
4414       FSTSW. This form checks for unmasked numeric exceptions.
4415
4416   2.  A no-wait form, where the mnemonic is prefixed with an FN, such as
4417       FNSTSW. This form ignores unmasked numeric exceptions.
4418
4419 When the control instruction is coded using the no-wait form of the
4420 mnemonic, the ASM386 assembler does not precede the ESC instruction with a
4421 wait instruction, and the CPU does not test the ERROR# status line from the
4422 NPX before executing the processor control instruction.
4423
4424 Only the processor control class of instructions have this alternate
4425 no-wait form. All numeric instructions are automatically synchronized by the
4426 80386; the CPU transfers all operands before initiating the next
4427 instruction. Because of this automatic synchronization by the 80386, numeric
4428 instructions for the 80387 need not be preceded by a CPU wait instruction
4429 in order to execute correctly.
4430
4431 It should also be noted that the 8087 instructions FENI and FDISI and the
4432 80287 instruction FSETPF perform no function in the 80387. If these opcodes
4433 are detected in an 80386/80387 instruction stream, the 80387 performs no
4434 specific operation and no internal states are affected. For programmers
4435 interested in porting numeric software from 80287 or 8087 environments to
4436 the 80386, however, it should be noted that program sections containing
4437 these exception-handling instructions are not likely to be completely
portable to the 80387. Appendix C and Appendix D contain a more complete
4439 description of the differences between the 80387 and the 80287/8087.
4440
4441
4442 Table 4-12.  Processor Control Instructions
4443
4444 FINIT/FNINIT           Initialize processor
4445 FLDCW                  Load control word
4446 FSTCW/FNSTCW           Store control word
4447 FSTSW/FNSTSW           Store status word
4448 FSTSW AX/FNSTSW AX     Store status word to AX
4449 FCLEX/FNCLEX           Clear exceptions
4450 FSTENV/FNSTENV         Store environment
4451 FLDENV                 Load environment
4452 FSAVE/FNSAVE           Save state
4453 FRSTOR                 Restore state
4454 FINCSTP                Increment stack pointer
4455 FDECSTP                Decrement stack pointer
4456 FFREE                  Free register
4457 FNOP                   No operation
4458 FWAIT                  CPU Wait
4459
4460
4461 4.8.1  FINIT/FNINIT
4462
4463 FINIT/FNINIT (initialize processor) sets the 80387 NPX into a known state,
4464 unaffected by any previous activity. It sets the control word to its default
4465 value 037FH (round to nearest, all exceptions masked, 64 bits of precision),
4466 clears the status word, and empties all floating-point stack registers. The
4467 no-wait form of this instruction causes the 80387 to abort any previous
4468 numeric operations currently executing in the NEU.
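
The default value 037FH can be decoded field by field. The Python sketch
below assumes the control word layout with the six exception masks in bits
0 through 5, precision control (PC) in bits 8 and 9, and rounding control
(RC) in bits 10 and 11; the function name and the descriptive strings are
hypothetical.

```python
MASK_NAMES = ["IM", "DM", "ZM", "OM", "UM", "PM"]       # bits 0..5
PRECISION = {0: "24 bits", 1: "reserved", 2: "53 bits", 3: "64 bits"}
ROUNDING = {0: "nearest", 1: "down", 2: "up", 3: "chop"}

def decode_control_word(cw):
    """Split a control word into (masks, precision, rounding)."""
    masks = {name: (cw >> bit) & 1 for bit, name in enumerate(MASK_NAMES)}
    precision = PRECISION[(cw >> 8) & 3]
    rounding = ROUNDING[(cw >> 10) & 3]
    return masks, precision, rounding

masks, precision, rounding = decode_control_word(0x037F)
print(all(masks.values()), precision, rounding)   # True 64 bits nearest
```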
4469
4470 This instruction performs the functional equivalent of a hardware RESET,
4471 with one exception: RESET causes the IM bit of the control word to be reset
4472 and the ES and IE bits of the status word to be set as a means of signaling
4473 the presence of an 80387; FINIT puts the opposite values in these bits.
4474
FINIT checks for unmasked numeric exceptions; FNINIT does not. Note that if
4476 FNINIT is executed while a previous 80387 memory-referencing instruction is
4477 running, 80387 bus cycles in progress are aborted. This instruction may be
4478 necessary to clear the 80387 if a processor-extension segment-overrun
4479 exception (interrupt 9) is detected by the CPU.
4480
4481
4482 4.8.2  FLDCW source
4483
4484 FLDCW (load control word) replaces the current processor control word with
4485 the word defined by the source operand. This instruction is typically used
4486 to establish or change the 80387's mode of operation. Note that if an
4487 exception bit in the status word is set, loading a new control word that
4488 unmasks that exception will activate the ERROR# output of the 80387. When
4489 changing modes, the recommended procedure is to first clear any exceptions
4490 and then load the new control word.
4491
4492
4493 4.8.3  FSTCW/FNSTCW destination
4494
4495 FSTCW/FNSTCW (store control word) writes the processor control word to the
4496 memory location defined by the destination. FSTCW checks for unmasked
4497 numeric exceptions; FNSTCW does not.
4498
4499
4500 4.8.4  FSTSW/FNSTSW destination
4501
4502 FSTSW/FNSTSW (store status word) writes the current value of the 80387
4503 status word to the destination operand in memory. The instruction is used to
4504
  -  Implement conditional branching following a comparison, FPREM, or
     FPREM1 instruction (FSTSW).

  -  Invoke exception handlers (by polling the exception bits) in
     environments that do not use interrupts (FSTSW).

FSTSW checks for unmasked numeric exceptions; FNSTSW does not.
4512
4513
4514 4.8.5  FSTSW AX/FNSTSW AX
4515
4516 FSTSW AX/FNSTSW AX (store status word to AX) is a special 80387 instruction
4517 that writes the current value of the 80387 status word directly into the
4518 80386 AX register. This instruction optimizes conditional branching in
4519 numeric programs, where the 80386 CPU must test the condition of various NPX
4520 status bits. The waited form FSTSW AX checks for unmasked numeric
4521 exceptions; the non-waited form FNSTSW AX does not.
4522
4523 When this instruction is executed, the 80386 AX register is updated with
4524 the NPX status word before the CPU executes any further instructions. The
4525 status stored is that from the completion of the prior ESC instruction.
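
As an illustration of the conditional-branching use, the following C sketch
decodes the condition codes from a status word as it would appear in AX.
It is a model, not CPU code. The names are hypothetical; the bit positions
(C0 in bit 8, C2 in bit 10, C3 in bit 14) and the compare encodings are
assumed from the status word and compare instruction descriptions elsewhere
in this manual. The example also notes which AH bits the 80386 SAHF
instruction would move into ZF, PF, and CF.

```c
#include <assert.h>
#include <stdint.h>

#define SW_C0 0x0100u   /* status-word bit 8  */
#define SW_C2 0x0400u   /* status-word bit 10 */
#define SW_C3 0x4000u   /* status-word bit 14 */

enum x87_cmp { X87_GREATER, X87_LESS, X87_EQUAL, X87_UNORDERED, X87_OTHER };

/* Interprets C3, C2, C0 as left by a compare instruction. */
enum x87_cmp x87_compare_result(uint16_t ax)
{
    switch (ax & (SW_C3 | SW_C2 | SW_C0)) {
    case 0u:                    return X87_GREATER;   /* ST > source */
    case SW_C0:                 return X87_LESS;      /* ST < source */
    case SW_C3:                 return X87_EQUAL;     /* ST = source */
    case SW_C3 | SW_C2 | SW_C0: return X87_UNORDERED;
    default:                    return X87_OTHER;     /* not a compare result */
    }
}

/* Example: AX as it might look after a compare found the operands equal. */
void x87_compare_example(void)
{
    uint16_t ax = 0x4000u;                 /* only C3 set */
    assert(x87_compare_result(ax) == X87_EQUAL);
    /* AH bits 6, 2, 0 are the ones SAHF copies to ZF, PF, CF. */
    assert(((ax >> 8) & 0x45u) == 0x40u);
}
```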
4526
4527
4528 4.8.6  FCLEX/FNCLEX
4529
4530 FCLEX/FNCLEX (clear exceptions) clears all exception flags, the exception
4531 status flag and the busy flag in the status word. As a consequence, the
4532 80387's ERROR# line goes inactive. FCLEX checks for unmasked numeric
4533 exceptions; FNCLEX does not.
4534
4535
4536 4.8.7  FSAVE/FNSAVE destination
4537
4538 FSAVE/FNSAVE (save state) writes the full 80387 state (environment plus
4539 register stack) to the memory location defined by the destination operand.
4540 Figure 4-1 and Figure 4-2 show the layout of the save area; the size and
4541 layout of the save area depend on the operating mode of the 80386
4542 (real-address mode or protected mode) and on the operand-size attribute in
4543 effect for the instruction (32-bit operand or 16-bit operand). When the
4544 80386 is in virtual-8086 mode, the real-address mode formats are used.
4545 Typically the instruction is coded to save this image on the CPU stack.
4546
4547 The values in the tag word in memory are determined during the execution of
4548 FSAVE/FNSAVE. If the tag in the status register indicates that the
4549 corresponding register is nonempty, the 80387 examines the data in the
4550 register and stores the appropriate tag in memory. Thus the tag that is
4551 stored always reflects the actual content of the register.
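
The classification can be sketched in C as follows. This is a hypothetical
model, not a description of the 80387's internal logic. It assumes the tag
encoding given with the tag word format (00 valid, 01 zero, 10 special,
11 empty) and treats a register as two fields: the sign-and-exponent word
(bits 79..64) and the 64-bit significand.

```c
#include <assert.h>
#include <stdint.h>

enum x87_tag { TAG_VALID = 0, TAG_ZERO = 1, TAG_SPECIAL = 2, TAG_EMPTY = 3 };

/* Tag that would be stored for one register.
   is_empty: nonzero if the register is marked empty.
   sign_exp: bits 79..64; significand: bits 63..0. */
enum x87_tag x87_stored_tag(int is_empty, uint16_t sign_exp,
                            uint64_t significand)
{
    unsigned exponent = sign_exp & 0x7FFFu;

    if (is_empty)
        return TAG_EMPTY;                 /* contents are not examined */
    if (exponent == 0 && significand == 0)
        return TAG_ZERO;                  /* +0 or -0 */
    if (exponent == 0 || exponent == 0x7FFFu || (significand >> 63) == 0)
        return TAG_SPECIAL;               /* denormal, infinity, NaN, unsupported */
    return TAG_VALID;                     /* normalized finite number */
}

void x87_tag_example(void)
{
    assert(x87_stored_tag(0, 0x3FFFu, 0x8000000000000000ULL) == TAG_VALID);   /* 1.0 */
    assert(x87_stored_tag(0, 0x8000u, 0) == TAG_ZERO);                        /* -0  */
    assert(x87_stored_tag(0, 0x7FFFu, 0x8000000000000000ULL) == TAG_SPECIAL); /* +infinity */
}
```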
4552
4553 FNSAVE delays its execution until all NPX activity completes normally.
4554 Thus, the save image reflects the state of the NPX following the completion
4555 of any running instruction. After writing the state image to memory,
4556 FSAVE/FNSAVE initializes the 80387 as if FINIT/FNINIT had been executed.
4557
4558 FSAVE/FNSAVE is useful whenever a program wants to save the current state
4559 of the NPX and initialize it for a new routine. Three examples are
4560
4561   1.  An operating system needs to perform a context switch (suspend the
4562       task that had been running and give control to a new task).
4563
4564   2.  An exception handler needs to use the 80387.
4565
4566   3.  An application task wants to pass a "clean" 80387 to a subroutine.
4567
4568 FSAVE checks for unmasked numeric exceptions before executing; FNSAVE does
4569 not.
4570
4571
4572 Figure 4-1.  FSAVE/FRSTOR Memory Layout (32-Bit)
4573
4574                31          23          15          7         0
4575               +-----------+-----------+-----------+-----------+ +0H
4576               +-----------+-----------+-----------+-----------+ +4H
4577               +-----------+--------       --------+-----------+ +8H
4578               +-----------+----- ENVIRONMENT -----+-----------+ +CH
4579               +-----------+--------       --------+-----------+ +10H
4580               +-----------+-----------+-----------+-----------+ +14H
4581               +-----------+-----------+-----------+-----------+ +18H
4582               +-----------+-----------+-----------+-----------+
4583
4584         +----+--------+----------------------------------------------+
4585    ST(0)|SIGN|EXPONENT|                  SIGNIFICAND                 | +1CH
4586    ST(1)+----+--------+----------------------------------------------+ +26H
4587    ST(2)+----+--------+----------------------------------------------+ +30H
4588    ST(3)+----+--------+----------------------------------------------+ +3AH
4589    ST(4)+----+--------+----------------------------------------------+ +44H
4590    ST(5)+----+--------+----------------------------------------------+ +4EH
4591    ST(6)+----+--------+----------------------------------------------+ +58H
4592    ST(7)+----+--------+----------------------------------------------+ +62H
4593         +----+--------+----------------------------------------------+
4594            79 78    64 63                                           0
4595
4596
4597 Figure 4-2.  FSAVE/FRSTOR Memory Layout (16-Bit)
4598
4599                           15         7        0
4600                          +----------+----------+ +0H
4601                          +----------+----------+ +2H
4602                          +-------       -------+ +4H
4603                          +---- ENVIRONMENT ----+ +6H
4604                          +-------       -------+ +8H
4605                          +----------+----------+ +AH
4606                          +----------+----------+ +CH
4607                          +----------+----------+
4608
4609         +----+--------+----------------------------------------------+
4610    ST(0)|SIGN|EXPONENT|                  SIGNIFICAND                 | +EH
4611    ST(1)+----+--------+----------------------------------------------+ +18H
4612    ST(2)+----+--------+----------------------------------------------+ +22H
4613    ST(3)+----+--------+----------------------------------------------+ +2CH
4614    ST(4)+----+--------+----------------------------------------------+ +36H
4615    ST(5)+----+--------+----------------------------------------------+ +40H
4616    ST(6)+----+--------+----------------------------------------------+ +4AH
4617    ST(7)+----+--------+----------------------------------------------+ +54H
4618         +----+--------+----------------------------------------------+
4619            79 78    64 63                                           0
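
The offsets in Figures 4-1 and 4-2 follow one rule: the environment occupies
seven doublewords (32-bit format) or seven words (16-bit format), and each
stack register takes 10 bytes after it. The C sketch below restates that
arithmetic; the function names are hypothetical and serve only to make the
layout checkable.

```c
#include <assert.h>
#include <stddef.h>

enum { X87_REG_BYTES = 10, X87_REG_COUNT = 8 };

/* Environment size in bytes: 7 doublewords (32-bit) or 7 words (16-bit). */
size_t x87_env_bytes(int operand_size_32)
{
    return operand_size_32 ? 28u : 14u;
}

/* Byte offset of ST(i) within the FSAVE image. */
size_t x87_save_offset_of_st(int operand_size_32, unsigned i)
{
    assert(i < (unsigned)X87_REG_COUNT);
    return x87_env_bytes(operand_size_32) + (size_t)i * X87_REG_BYTES;
}

/* Total size of the FSAVE image. */
size_t x87_save_bytes(int operand_size_32)
{
    return x87_env_bytes(operand_size_32)
         + (size_t)X87_REG_COUNT * X87_REG_BYTES;
}
```

Under these rules the complete image is 108 bytes in the 32-bit format and
94 bytes in the 16-bit format, which is the amount of stack space a caller
of FSAVE must reserve.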
4620
4621
4622 4.8.8  FRSTOR source
4623
4624 FRSTOR (restore state) reloads the 80387 state from the memory area defined
4625 by the source operand. This information should have been written by a
4626 previous FSAVE/FNSAVE instruction and not altered by any other instruction.
4627 FRSTOR automatically waits, checking for interrupts, until all data
4628 transfers are completed before continuing to the next instruction.
4629
4630 Note that the 80387 "reacts" to its new state at the conclusion of the
4631 FRSTOR. It generates an exception request, for example, if the exception and
4632 mask bits in the memory image so indicate when the next WAIT or
4633 exception-checking ESC instruction is executed.
4634
4635
4636 4.8.9  FSTENV/FNSTENV destination
4637
4638 FSTENV/FNSTENV (store environment) writes the 80387's basic
4639 status (control, status, and tag words, and exception pointers) to the
4640 memory location defined by the destination operand. Typically, the
4641 environment is saved on the CPU stack. FSTENV/FNSTENV is often used by
4642 exception handlers because it provides access to the exception pointers
4643 that identify the offending instruction and operand. After saving the
4644 environment, FSTENV/FNSTENV sets all exception masks in the 80387 control
4645 word (i.e., masks all exceptions). FSTENV checks for pending exceptions
4646 before executing; FNSTENV does not.
4647
4648 Figures 4-3 through 4-6 show the format of the environment data in memory;
4649 the size and layout of the save area depend on the operating mode of the
4650 80386 (real-address mode or protected mode) and on the operand-size
4651 attribute in effect for the instruction (32-bit operand or 16-bit operand).
4652 When the 80386 is in virtual-8086 mode, the real-address mode formats are
4653 used. FNSTENV does not store the environment until all NPX activity has
4654 completed. Thus, the data saved by the instruction reflects the 80387 after
4655 any previously decoded instruction has been executed.
4656
4657 The values in the tag word in memory are determined during the execution of
4658 FNSTENV/FSTENV. If the tag in the status register indicates that the
4659 corresponding register is nonempty, the 80387 examines the data in the
4660 register and stores the appropriate tag in memory. Thus the tag that is
4661 stored always reflects the actual content of the register.
4662
4663
4664 Figure 4-3.  Protected Mode 80387 Environment, 32-Bit Format
4665
4666                       32-BIT PROTECTED MODE FORMAT
4667
4668  31                23                15                7               0
4669 +-----------------+-----------------+-----------------+-----------------+
4670 |             RESERVED              |            CONTROL WORD           | 0H
4671 +-----------------+-----------------+-----------------+-----------------+
4672 |             RESERVED              |            STATUS WORD            | 4H
4673 +-----------------+-----------------+-----------------+-----------------+
4674 |             RESERVED              |              TAG WORD             | 8H
4675 +-----------------+-----------------+-----------------+-----------------+
4676 |                               IP OFFSET                               | CH
4677 +----------+------+-----------------+-----------------+-----------------+
4678 | 0 0 0 0 0|      OPCODE 10..0      |            CS SELECTOR            | 10H
4679 +----------+------+-----------------+-----------------+-----------------+
4680 |                          DATA OPERAND OFFSET                          | 14H
4681 +-----------------+-----------------+-----------------+-----------------+
4682 |             RESERVED              |          OPERAND SELECTOR         | 18H
4683 +-----------------+-----------------+-----------------+-----------------+
4684
4685
4686 Figure 4-4.  Real Mode 80387 Environment, 32-Bit Format
4687
4688                       32-BIT REAL-ADDRESS MODE FORMAT
4689
4690  31                23                15                7               0
4691 +-----------------+-----------------+-----------------+-----------------+
4692 |             RESERVED              |            CONTROL WORD           | 0H
4693 +-----------------+-----------------+-----------------+-----------------+
4694 |             RESERVED              |            STATUS WORD            | 4H
4695 +-----------------+-----------------+-----------------+-----------------+
4696 |             RESERVED              |              TAG WORD             | 8H
4697 +-----------------+-----------------+-----------------+-----------------+
4698 |             RESERVED              |      INSTRUCTION POINTER 15..0    | CH
4699 +---------+-------+-----------------+-----------+-+---+-----------------+
4700 | 0 0 0 0 |     INSTRUCTION POINTER 31..16      |0|    OPCODE 10..0     | 10H
4701 +---------+-------+-----------------+-----------+-+---+-----------------+
4702 |             RESERVED              |       OPERAND POINTER 15..0       | 14H
4703 +---------+-------+-----------------+-----------+-----+-----------------+
4704 | 0 0 0 0 |       OPERAND POINTER 31..16        |0 0 0 0 0 0 0 0 0 0 0 0| 18H
4705 +---------+-------+-----------------+-----------+-----+-----------------+
4706
4707
4708 Figure 4-5.  Protected Mode 80387 Environment, 16-Bit Format
4709
4710                         16-BIT PROTECTED MODE FORMAT
4711
4712                      15               7              0
4713                     +----------------+----------------+
4714                     |          CONTROL WORD           | 0H
4715                     +----------------+----------------+
4716                     |           STATUS WORD           | 2H
4717                     +----------------+----------------+
4718                     |            TAG WORD             | 4H
4719                     +----------------+----------------+
4720                     |            IP OFFSET            | 6H
4721                     +----------------+----------------+
4722                     |           CS SELECTOR           | 8H
4723                     +----------------+----------------+
4724                     |         OPERAND OFFSET          | AH
4725                     +----------------+----------------+
4726                     |        OPERAND SELECTOR         | CH
4727                     +----------------+----------------+
4728
4729
4730 Figure 4-6.  Real Mode 80387 Environment, 16-Bit Format
4731
4732                           16-BIT REAL-ADDRESS MODE
4733                         AND VIRTUAL-8086 MODE FORMAT
4734
4735                      15               7              0
4736                     +----------------+----------------+
4737                     |          CONTROL WORD           | 0H
4738                     +----------------+----------------+
4739                     |           STATUS WORD           | 2H
4740                     +----------------+----------------+
4741                     |            TAG WORD             | 4H
4742                     +----------------+----------------+
4743                     |    INSTRUCTION POINTER 15..0    | 6H
4744                     +---------+-+----+----------------+
4745                     |IP 19..16|0|      OPCODE 10..0   | 8H
4746                     +---------+-+----+----------------+
4747                     |      OPERAND POINTER 15..0      | AH
4748                     +---------+-+----+----------------+
4749                     |OP 19..16|0|0 0 0 0 0 0 0 0 0 0 0| CH
4750                     +---------+-+----+----------------+
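
An exception handler that has stored the environment typically needs to
reassemble the exception pointers from these fields. The C sketch below does
so for the 16-bit real-address mode format of Figure 4-6. The structure and
function names are hypothetical. Rebuilding the full opcode relies on the
assumption that the five opcode bits not stored in the image are the fixed
ESC pattern 11011B.

```c
#include <assert.h>
#include <stdint.h>

/* The seven words of Figure 4-6, in memory order (offsets 0H..CH). */
struct x87_env16_real {
    uint16_t control, status, tag;
    uint16_t ip_low;          /* instruction pointer 15..0               */
    uint16_t ip_high_opcode;  /* IP 19..16 in bits 15..12, opcode 10..0  */
    uint16_t op_low;          /* operand pointer 15..0                   */
    uint16_t op_high;         /* OP 19..16 in bits 15..12                */
};

/* 20-bit address of the instruction that caused the exception. */
uint32_t env16_instruction_address(const struct x87_env16_real *e)
{
    return ((uint32_t)(e->ip_high_opcode >> 12) << 16) | e->ip_low;
}

/* 20-bit address of that instruction's memory operand. */
uint32_t env16_operand_address(const struct x87_env16_real *e)
{
    return ((uint32_t)(e->op_high >> 12) << 16) | e->op_low;
}

/* Full two-byte opcode, restoring the ESC bits 11011B on top. */
uint16_t env16_full_opcode(const struct x87_env16_real *e)
{
    return (uint16_t)(0xD800u | (e->ip_high_opcode & 0x07FFu));
}

void env16_example(void)
{
    struct x87_env16_real e = { 0x033E, 0x0001, 0xFFFF,
                                0x2345, 0x31FA, 0x0010, 0xA000 };
    assert(sizeof e == 14);
    assert(env16_instruction_address(&e) == 0x32345u);
    assert(env16_operand_address(&e) == 0xA0010u);
    assert(env16_full_opcode(&e) == 0xD9FAu);
}
```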
4751
4752
4753 4.8.10  FLDENV source
4754
4755 FLDENV (load environment) reloads the environment from the memory area
4756 defined by the source operand. This data should have been written by a
4757 previous FSTENV/FNSTENV instruction. CPU instructions (that do not reference
4758 the environment image) may immediately follow FLDENV. FLDENV automatically
4759 waits for all data transfers to complete before executing the next
4760 instruction.
4761
4762 Note that loading an environment image that contains an unmasked exception
4763 causes a numeric exception when the next WAIT or exception-checking ESC
4764 instruction is executed.
4765
4766
4767 4.8.11  FINCSTP
4768
4769 FINCSTP (increment NPX stack pointer) adds 1 to the stack top pointer (TOP)
4770 in the status word. It does not alter tags or register contents, nor does it
4771 transfer data. It is not equivalent to popping the stack, because it does
4772 not set the tag of the previous stack top to empty. Incrementing the stack
4773 pointer when TOP=7 produces TOP=0.
4774
4775
4776 4.8.12  FDECSTP
4777
4778 FDECSTP (decrement NPX stack pointer) subtracts 1 from TOP, the stack top
4779 pointer in the status word. No tags or registers are altered, nor is any
4780 data transferred. Executing FDECSTP when TOP=0 produces TOP=7.
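
The wraparound behavior of FINCSTP and FDECSTP amounts to arithmetic modulo
8 on the three-bit stack top field. A minimal C model follows, assuming the
field occupies bits 13..11 of the status word; the names are hypothetical,
and only the stack top field is modeled (any effect on the condition codes
is ignored).

```c
#include <assert.h>
#include <stdint.h>

#define SW_TOP_SHIFT 11
#define SW_TOP_MASK  (7u << SW_TOP_SHIFT)   /* status-word bits 13..11 */

unsigned sw_top(uint16_t sw)
{
    return (sw & SW_TOP_MASK) >> SW_TOP_SHIFT;
}

/* Adds delta (+1 for FINCSTP, -1 for FDECSTP) to the stack top field
   modulo 8; the other bits of the word are passed through unchanged. */
uint16_t sw_adjust_top(uint16_t sw, int delta)
{
    unsigned top = (sw_top(sw) + (unsigned)delta) & 7u;
    return (uint16_t)((sw & ~SW_TOP_MASK) | (top << SW_TOP_SHIFT));
}

void sw_top_example(void)
{
    uint16_t at_seven = (uint16_t)(7u << SW_TOP_SHIFT);
    assert(sw_top(sw_adjust_top(at_seven, +1)) == 0);   /* 7 wraps to 0 */
    assert(sw_top(sw_adjust_top(0, -1)) == 7);          /* 0 wraps to 7 */
}
```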
4781
4782
4783 4.8.13  FFREE destination
4784
4785 FFREE (free register) changes the destination register's tag to empty; the
4786 content of the register is unaffected.
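
In terms of the tag word, FFREE ST(i) can be modeled as setting one two-bit
field to 11B (empty). The sketch below is a hypothetical C model; it assumes
that the tag for physical register n occupies tag-word bits 2n+1..2n and
that ST(i) names physical register (TOP + i) modulo 8.

```c
#include <assert.h>
#include <stdint.h>

/* Tag word after FFREE ST(i), given the current stack top. The register
   contents themselves are not part of this model and are not changed. */
uint16_t tagword_after_ffree(uint16_t tag_word, unsigned top, unsigned i)
{
    unsigned phys;

    assert(top < 8 && i < 8);
    phys = (top + i) & 7u;                 /* ST(i) is relative to TOP */
    return (uint16_t)(tag_word | (3u << (2 * phys)));
}
```

For example, with TOP = 6, freeing ST(3) marks physical register 1 empty, so
an all-valid tag word of 0000H becomes 000CH.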
4787
4788
4789 4.8.14  FNOP
4790
4791 FNOP (no operation) effectively performs no operation.
4792
4793
4794 4.8.15  FWAIT (CPU Instruction)
4795
4796 FWAIT is not actually an 80387 instruction, but an alternate mnemonic for
4797 the 80386 WAIT instruction. The FWAIT or WAIT mnemonic should be coded
4798 whenever the programmer wants to check for a pending error before modifying
4799 a variable used in the previous floating-point instruction. Coding an FWAIT
4800 instruction after an 80387 instruction ensures that unmasked numeric
4801 exceptions occur and exception handlers are invoked before the next
4802 instruction has a chance to examine the results of the 80387 instruction.
4803
4804 More information on when to code an FWAIT instruction is given in Chapter 5
4805 in the section "Concurrent Processing with the 80387."
4806
4807
4808
4809 Chapter 5  Programming Numeric Applications
4810
4811 ---------------------------------------------------------------------------
4812
4813 5.1  Programming Facilities
4814
4815 As described previously, the 80387 NPX is programmed simply as an extension
4816 of the 80386 CPU. This section describes how programmers in ASM386 and in a
4817 variety of higher-level languages can work with the 80387.
4818
4819 The level of detail in this section is intended to give programmers a basic
4820 understanding of the software tools that can be used with the 80387, but
4821 this information does not document the full capabilities of these
4822 facilities. Complete documentation is available with each program
4823 development product.
4824
4825
4826 5.1.1  High-Level Languages
4827
4828 For programmers using high-level languages, the programming and operation
4829 of the NPX is handled automatically by the compiler. A variety of Intel
4830 high-level languages are available that automatically make use of the 80387
4831 NPX when appropriate. These languages include C-386 and PL/M-386. In
4832 addition many high-level language compilers are available from independent
4833 software vendors.
4834
4835 Each of these high-level languages has special numeric libraries allowing
4836 programs to take advantage of the capabilities of the 80387 NPX. No special
4837 programming conventions are necessary to make use of the 80387 NPX when
4838 programming numeric applications in any of these languages.
4839
4840 Programmers in PL/M-386 and ASM386 can also make use of many of these
4841 library routines by using routines contained in the 80387 Support Library.
4842 These libraries implement many of the functions provided by higher-level
4843 languages, including exception handlers, ASCII-to-floating-point
4844 conversions, and a more complete set of transcendental functions than that
4845 provided by the 80387 instruction set.
4846
4847
4848 5.1.2  C Programs
4849
4850 C programmers automatically cause the C compiler to generate 80387
4851 instructions when they use the double and float data types. The float type
4852 corresponds to the 80387's single real format; the double type corresponds
4853 to the 80387's double real format. The statement #include <math.h> causes
4854 mathematical functions such as sin and sqrt to return values of type
4855 double. Figure 5-1 illustrates the ease with which C programs interface
4856 with the 80387.
4857
4858
4859 Figure 5-1.  Sample C-386 Program
4860
4861 XENIX286 C386 COMPILER,  V0.2 COMPILATION OF MODULE SAMPLE
4862 OBJECT MODULE PLACED IN  sample.obj
4863 COMPILER INVOKED BY:  c386 sample.c
4864
4865 stmt level
4866
4867       1         /******************************************************
4868       2         *                                                      *
4869       3         *                  SAMPLE C PROGRAM                    *
4870       4         *                                                      *
4871       5         ******************************************************/
4872       6
4873       7         /** Include /usr/include/stdio.h if necessary **/
4874       8         /** Include math declarations for transcendentals and others **/
4875       9
4876      10         #include </usr/include/math.h>
4877      36         #define PI 3.141592654
4878      37
4879      38         main()
4880      39         {
4881      40   1     double        sin_result, cos_result;
4882      41   1     double        angle_deg = 0.0, angle_rad;
4883      42   1     int           i, no_of_trial = 4;
4884      43
4885      44   1          for( i = 1; i <= no_of_trial; i++){
4886      45   2              angle_rad = angle_deg * PI / 180.0;
4887      46   2              sin_result = sin (angle_rad);
4888      47   2              cos_result = cos (angle_rad);
4889      48   2              printf("sine of %f degrees equals %f\n", angle_deg, sin_result);
4890      49   2              printf("cosine of %f degrees equals %f\n\n", angle_deg, cos_result);
4891      50   2              angle_deg = angle_deg + 30.0;
4892      51   2              }
4893      52   1     /** etc. **/
4894      53   1     }
4895
4896 C386 COMPILATION COMPLETE. 0 WARNINGS, 0 ERRORS
4897
4898
4899 5.1.3  PL/M-386
4900
4901 Programmers in PL/M-386 can access a very useful subset of the 80387's
4902 numeric capabilities. The PL/M-386 REAL data type corresponds to the NPX's
4903 single real (32-bit) format. This data type provides a range of about
4904 8.43 * 10^(-37) <= |X| <= 3.38 * 10^(38), with about seven significant decimal
4905 digits. This representation is adequate for the data manipulated by many
4906 microcomputer applications.
4907
4908 The utility of the REAL data type is extended by the PL/M-386 compiler's
4909 practice of holding intermediate results in the 80387's extended real
4910 format. This means that the full range and precision of the processor are
4911 utilized for intermediate results. Underflow, overflow, and rounding
4912 exceptions are most likely to occur during intermediate computations rather
4913 than during calculation of an expression's final result. Holding
4914 intermediate results in extended-precision real format greatly reduces the
4915 likelihood of overflow and underflow and eliminates roundoff as a serious
4916 source of error until the final assignment of the result is performed.
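
The benefit of wide intermediates can be demonstrated in C, the other
language used in this chapter. The sketch below is an analogy, not PL/M-386
output: the function names are hypothetical, and it assumes only that long
double is at least as wide as double (on compilers that map long double to
the 80-bit extended format it matches the NPX behavior described above).

```c
#include <assert.h>
#include <float.h>

/* Every intermediate is rounded to single precision; the volatile
   qualifier forces the product to be stored as a float. */
float scale_single(float x, float up, float down)
{
    volatile float t = x * up;      /* may overflow to infinity */
    return t / down;
}

/* The intermediate is held in a wider format; only the final result is
   rounded to single precision. */
float scale_extended(float x, float up, float down)
{
    long double t = (long double)x * up;
    return (float)(t / down);
}

void scale_example(void)
{
    /* 3e38 * 10 exceeds the single real range, 3e38 itself does not. */
    assert(scale_single(3e38f, 10.0f, 10.0f) > FLT_MAX);     /* overflowed */
    assert(scale_extended(3e38f, 10.0f, 10.0f) == 3e38f);    /* survived   */
}
```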
4917
4918 The compiler generates 80387 code to evaluate expressions that contain REAL
4919 data types, whether variables or constants or both. This means that
4920 addition, subtraction, multiplication, division, comparison, and assignment
4921 of REALs will be performed by the NPX. INTEGER expressions, on the other
4922 hand, are evaluated on the CPU.
4923
4924 Five built-in procedures (Table 5-1) give the PL/M-386 programmer access to
4925 80387 functions manipulated by the processor control instructions. Prior to
4926 any arithmetic operations, a typical PL/M-386 program will set up the NPX
4927 using the INIT$REAL$MATH$UNIT procedure and then issue SET$REAL$MODE to
4928 configure the NPX. SET$REAL$MODE loads the 80387 control word, and its
4929 16-bit parameter has the format shown for the control word in Chapter 2.
4930 The recommended value of this parameter is 033EH (round to nearest, 64-bit
4931 precision, all exceptions masked except invalid operation). Other settings
4932 may be used at the programmer's discretion.
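
The recommended parameter can be checked by decoding it field by field. The
C sketch below is illustrative only: the structure and function names are
hypothetical, and the field positions (masks in bits 5..0, precision control
in bits 9..8, rounding control in bits 11..10) are assumed from the control
word format in Chapter 2.

```c
#include <assert.h>
#include <stdint.h>

struct x87_control {
    unsigned masks;      /* bits 5..0: PM UM OM ZM DM IM (1 = masked)   */
    unsigned precision;  /* bits 9..8: 0 = 24-bit, 2 = 53-bit, 3 = 64-bit */
    unsigned rounding;   /* bits 11..10: 0 = nearest, 1 = down, 2 = up,
                            3 = chop (toward zero)                       */
};

struct x87_control x87_decode_control(uint16_t cw)
{
    struct x87_control c;
    c.masks     = cw & 0x3Fu;
    c.precision = (cw >> 8) & 3u;
    c.rounding  = (cw >> 10) & 3u;
    return c;
}

void x87_control_example(void)
{
    struct x87_control c = x87_decode_control(0x033Eu);
    assert(c.rounding == 0);          /* round to nearest           */
    assert(c.precision == 3);         /* 64-bit precision           */
    assert(c.masks == 0x3Eu);         /* all masked except bit 0... */
    assert((c.masks & 1u) == 0);      /* ...invalid operation       */
}
```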
4933
4934 If any exceptions are unmasked, an exception handler must be provided in
4935 the form of an interrupt procedure that is designated to be invoked via CPU
4936 interrupt vector number 16. The exception handler can use the GET$REAL$ERROR
4937 procedure to obtain the low-order byte of the 80387 status word and to then
4938 clear the exception flags. The byte returned by GET$REAL$ERROR contains the
4939 exception flags; these can be examined to determine the source of the
4940 exception.
4941
4942 The SAVE$REAL$STATUS and RESTORE$REAL$STATUS procedures are provided
4943 for multitasking environments where a running task that uses the 80387 may
4944 be preempted by another task that also uses the 80387. It is the
4945 responsibility of the operating system to issue SAVE$REAL$STATUS before it
4946 executes any statements that affect the 80387; these include the
4947 INIT$REAL$MATH$UNIT and SET$REAL$MODE procedures as well as arithmetic
4948 expressions. SAVE$REAL$STATUS saves the 80387 state (registers, status, and
4949 control words, etc.) on the CPU's stack. RESTORE$REAL$STATUS reloads the
4950 state information; the preempting task must invoke this procedure before
4951 terminating in order to restore the 80387 to its state at the time the
4952 running task was preempted. This enables the preempted task to resume
4953 execution from the point of its preemption.
4954
4955
4956 Table 5-1.  PL/M-386 Built-In Procedures
4957
4958 Procedure                80387         Description
4959                          Instruction
4960
4961 INIT$REAL$MATH$UNIT      FINIT         Initialize processor.
4962 SET$REAL$MODE            FLDCW         Set exception masks, rounding,
4963                                        precision, and infinity controls.
4964 GET$REAL$ERROR           FNSTSW        Store, then clear, exception flags.
4965                          & FNCLEX
4966 SAVE$REAL$STATUS         FNSAVE        Save processor state.
4967 RESTORE$REAL$STATUS      FRSTOR        Restore processor state.
4968
4969
4970 5.1.4  ASM386
4971
4972 The ASM386 assembly language provides programmers with complete access to
4973 all of the facilities of the 80386 and 80387 processors.
4974
4975 The programmer's view of the 80386/80387 hardware is a single machine with
4976 these resources:
4977
4978   -  160 instructions
4979   -  12 data types
4980   -  8 general registers
4981   -  6 segment registers
4982   -  8 floating-point registers, organized as a stack
4983
4984
4985 5.1.4.1  Defining Data
4986
4987 The ASM386 directives shown in Table 5-2 allocate storage for 80387
4988 variables and constants. As with other storage allocation directives, the
4989 assembler associates a type with any variable defined with these directives.
4990 The type value is equal to the length of the storage unit in bytes (10 for
4991 DT, 8 for DQ, etc.). The assembler checks the type of any variable coded in
4992 an instruction to be certain that it is compatible with the instruction.
4993 For example, the coding FIADD ALPHA will be flagged as an error if ALPHA's
4994 type is not 2 or 4, because integer addition is only available for word and
4995 short integer (doubleword) data types. The operand's type also tells the
4996 assembler which machine instruction to produce; although to the programmer
4997 there is only an FIADD instruction, a different machine instruction is
4998 required for each operand type.
4999
5000 On occasion it is desirable to use an instruction with an operand that has
5001 no declared type. For example, if register BX points to a short integer
5002 variable, a programmer may want to code FIADD [BX]. This can be done by
5003 informing the assembler of the operand's type in the instruction, coding
5004 FIADD DWORD PTR [BX]. The corresponding overrides for the other storage
5005 allocations are WORD PTR, QWORD PTR, and TBYTE PTR.
5006
5007 The assembler does not, however, check the types of operands used in
5008 processor control instructions. Coding FRSTOR [BP] implies that register BP
5009 points to the location (probably in the stack) where the processor's state
5010 record (94 bytes in 16-bit format, 108 in 32-bit format) was previously saved.
5011
5012 The initial values for 80387 constants may be coded in several different
5013 ways. Binary integer constants may be specified as bit strings, decimal
5014 integers, octal integers, or hexadecimal strings. Packed decimal values are
5015 normally written as decimal integers, although the assembler will accept and
5016 convert other representations of integers. Real values may be written as
5017 ordinary decimal real numbers (decimal point required), as decimal numbers
5018 in scientific notation, or as hexadecimal strings. Using hexadecimal strings
5019 is primarily intended for defining special values such as infinities, NaNs,
5020 and denormalized numbers. Most programmers will find that ordinary decimal
5021 and scientific decimal provide the simplest way to initialize 80387
5022 constants. Figure 5-2 compares several ways of setting the various 80387
5023 data types to the same initial value.
5024
5025 Note that preceding 80387 variables and constants with the ASM386 EVEN
5026 directive ensures that the operands will be word-aligned in memory. The best
5027 performance is obtained when data transfers are double-word aligned. All
5028 80387 data types occupy integral numbers of words so that no storage is
5029 "wasted" if blocks of variables are defined together and preceded by a
5030 single EVEN declarative.
5031
5032
5033 Table 5-2.  ASM386 Storage Allocation Directives
5034
5035 Directive   Interpretation       Data Types
5036
5037 DW          Define Word          Word integer
5038 DD          Define Doubleword    Short integer, short real
5039 DQ          Define Quadword      Long integer, long real
5040 DT          Define Tenbyte       Packed decimal, temporary real
5041
5042
5043 Figure 5-2.  Sample 80387 Constants
5044
5045 ; THE FOLLOWING ALL ALLOCATE THE CONSTANT: -126
5046 ; NOTE TWO'S COMPLEMENT STORAGE OF NEGATIVE BINARY INTEGERS.
5047 ;
5048 ; EVEN                                  ; FORCE WORD ALIGNMENT
5049 WORD_INTEGER    DW  1111111110000010B   ; BIT STRING
5050 SHORT_INTEGER   DD  0FFFFFF82H          ; HEX STRING MUST START
5051                                         ; WITH DIGIT
5052 LONG_INTEGER    DQ  -126                ; ORDINARY DECIMAL
5053 SINGLE_REAL     DD  -126.0              ; NOTE PRESENCE OF '.'
5054 DOUBLE_REAL     DQ  -1.26E2             ; "SCIENTIFIC"
5055 PACKED_DECIMAL  DT  -126                ; ORDINARY DECIMAL INTEGER
5056 ;
5057 ; IN THE FOLLOWING, SIGN AND EXPONENT IS 'C005'
5058 ;    SIGNIFICAND IS 'FC00...00', 'R' INFORMS ASSEMBLER THAT
5059 ;    THE STRING REPRESENTS A REAL DATA TYPE.
5060 ;
5061 EXTENDED_REAL   DT  0C005FC00000000000000R  ; HEX STRING
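
Bit patterns such as these are easy to get wrong by hand, so it can help to
derive them mechanically. The C sketch below is one way to do that for
integer values. The function name is hypothetical, and the code assumes the
extended real layout of a sign bit, a 15-bit exponent biased by 16383, and a
64-bit significand whose integer bit (bit 63) is stored explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extended real image of a nonzero integer: returns the sign/exponent
   word and writes the 64-bit significand through the pointer. */
uint16_t extended_from_int(int32_t value, uint64_t *significand)
{
    uint64_t mag;
    unsigned exponent = 63;

    assert(value != 0 && significand != NULL);
    mag = value < 0 ? (uint64_t)(-(int64_t)value) : (uint64_t)value;
    while ((mag >> 63) == 0) {      /* normalize: integer bit into bit 63 */
        mag <<= 1;
        exponent--;
    }
    *significand = mag;
    return (uint16_t)((value < 0 ? 0x8000u : 0u) | (16383u + exponent));
}

void encode_minus_126_example(void)
{
    uint64_t sig;
    assert((uint16_t)(int16_t)-126 == 0xFF82u);          /* word integer  */
    assert((uint32_t)(int32_t)-126 == 0xFFFFFF82u);      /* short integer */
    assert(extended_from_int(-126, &sig) == 0xC005u);    /* sign/exponent */
    assert(sig == 0xFC00000000000000ULL);                /* significand   */
}
```

Because 126 is 1.111110B times 2^6, the normalized significand begins with
the byte FCH and the biased exponent is 4005H, giving C005H once the sign
bit is included.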
5062
5063
5064 5.1.4.2  Records and Structures
5065
5066 The ASM386 RECORD and STRUC (structure) declaratives can be very useful in
5067 NPX programming. The record facility can be used to define the bit fields of
5068 the control, status, and tag words. Figure 5-3 shows one definition of the
5069 status word and how it might be used in a routine that repeats FPREM1 until
5070 the partial-remainder reduction is complete (condition code C2 clear).
5071
5072 Because structures allow different but related data types to be grouped
5073 together, they often provide a natural way to represent "real world" data
5074 organizations. The fact that the structure template may be "moved" about in
5075 memory adds to its flexibility. Figure 5-4 shows a simple structure that
5076 might be used to represent data consisting of a series of test score
5077 samples. A structure could also be used to define the organization of the
5078 information stored and loaded by the FSTENV and FLDENV instructions.
5079
5080
5081 Figure 5-3.  Status Word Record Definition
5082
5083 ; RESERVE SPACE FOR STATUS WORD
5084 STATUS_WORD     DW  ?
5085 ; LAY OUT STATUS WORD FIELDS
5086 STATUS RECORD
5087 &   BUSY:           1,
5088 &   COND_CODE3:     1,
5089 &   STACK_TOP:      3,
5090 &   COND_CODE2:     1,
5091 &   COND_CODE1:     1,
5092 &   COND_CODE0:     1,
5093 &   INT_REQ:        1,
5094 &   S_FLAG:         1,
5095 &   P_FLAG:         1,
5096 &   U_FLAG:         1,
5097 &   O_FLAG:         1,
5098 &   Z_FLAG:         1,
5099 &   D_FLAG:         1,
5100 &   I_FLAG:         1
5101 ; REDUCE UNTIL COMPLETE
5102 REDUCE: FPREM1
5103         FNSTSW  STATUS_WORD
5104         TEST    STATUS_WORD, MASK COND_CODE2
5105         JNZ     REDUCE
5106
5107
5108 Figure 5-4.  Structure Definition
5109
5110 SAMPLE     STRUC
5111     N_OBS   DD  ?   ; SHORT INTEGER
5112     MEAN    DQ  ?   ; DOUBLE REAL
5113     MODE    DW  ?   ; WORD INTEGER
5114     STD_DEV DQ  ?   ; DOUBLE REAL
5115     ; ARRAY OF OBSERVATIONS -- WORD INTEGER
5116     TEST_SCORES DW 1000 DUP (?)
5117 SAMPLE     ENDS
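
For comparison, the same record can be described in C. This is a sketch
under stated assumptions rather than an equivalent the manual provides: it
assumes an 8-byte double and relies on the #pragma pack extension (accepted
by several common compilers) to suppress the padding a C compiler would
otherwise insert, so that the field offsets match the assembler's
back-to-back allocation. The lower-case names are C spellings of the fields
in Figure 5-4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)            /* no padding, as in the assembler layout */
struct sample {
    int32_t n_obs;               /* DD: short integer */
    double  mean;                /* DQ: double real   */
    int16_t mode;                /* DW: word integer  */
    double  std_dev;             /* DQ: double real   */
    int16_t test_scores[1000];   /* DW 1000 DUP (?)   */
};
#pragma pack(pop)

void sample_layout_example(void)
{
    assert(offsetof(struct sample, mean) == 4);
    assert(offsetof(struct sample, mode) == 12);
    assert(offsetof(struct sample, std_dev) == 14);
    assert(offsetof(struct sample, test_scores) == 22);
    assert(sizeof(struct sample) == 2022);
}
```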
5118
5119
5120 5.1.4.3  Addressing Methods
5121
5122 80387 memory data can be accessed with any of the memory addressing methods
5123 provided by the ModR/M byte and (optionally) the SIB byte. This means that
5124 80387 data types can be incorporated in data aggregates ranging from simple
5125 to complex according to the needs of the application. The addressing methods
5126 and the ASM386 notation used to specify them in instructions make the
5127 accessing of structures, arrays, arrays of structures, and other
5128 organizations direct and straightforward. Table 5-3 gives several examples
5129 of 80387 instructions coded with operands that illustrate different
5130 addressing methods.
5131
5132
5133 Table 5-3.  Addressing Method Examples
5134
5135 Coding                    Interpretation
5136
5137 FIADD ALPHA               ALPHA is a simple scalar (mode is direct).
5138
5139 FDIVR ALPHA.BETA          BETA is a field in a structure that is
5140                           "overlaid" on ALPHA (mode is direct).
5141
5142 FMUL QWORD PTR [BX]       BX contains the address of a long real
5143                           variable (mode is register indirect).
5144
5145 FSUB ALPHA [SI]           ALPHA is an array and SI contains the
5146                           offset of an array element from the start of
5147                           the array (mode is indexed).
5148
5149 FILD [BP].BETA            BP contains the address of a structure on
5150                           the CPU stack and BETA is a field in the
5151                           structure (mode is based).
5152
5153 FBLD TBYTE PTR [BX] [DI]  BX contains the address of a packed
5154                           decimal array and DI contains the offset of
5155                           an array element (mode is based indexed).
5156
5157
5158 5.1.5  Comparative Programming Example
5159
5160 Figures 5-5 and 5-6 show the PL/M-386 and ASM386 code for a simple 80387
5161 program, called ARRSUM. The program references an array (X$ARRAY), which
5162 contains 0-100 single real values; the integer variable N$OF$X indicates the
5163 number of array elements the program is to consider. ARRSUM steps through
5164 X$ARRAY accumulating three sums:
5165
5166   o  SUM$X, the sum of the array values
5167
5168   o  SUM$INDEXES, the sum of each array value times its index, where the
5169      index of the first element is 1, the second is 2, etc.
5170
5171   o  SUM$SQUARES, the sum of each array element squared
5172
5173 (A true program, of course, would go beyond these steps to store and use
5174 the results of these calculations.) The control word is set with the
5175 recommended values: round to nearest, 64-bit precision, interrupts enabled,
5176 and all exceptions masked except invalid operation. It is assumed that an
5177 exception handler has been written to field the invalid operation if it
5178 occurs, and that it is invoked by interrupt pointer 16. Either version of
5179 the program will run on an actual or an emulated 80387 without altering the
5180 code shown.
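
The control word value used by both versions of ARRSUM is 033EH. The C fragment below is a minimal sketch of how that value breaks down into the settings just listed. The function and constant names are hypothetical; the field positions assumed are the six exception masks in bits 5-0, precision control in bits 9-8, and rounding control in bits 11-10.

```c
#include <assert.h>
#include <stdint.h>

/* Exception mask bits of the control word (1 = masked). */
enum {
    CW_MASK_INVALID   = 0x01,
    CW_MASK_DENORMAL  = 0x02,
    CW_MASK_ZERODIV   = 0x04,
    CW_MASK_OVERFLOW  = 0x08,
    CW_MASK_UNDERFLOW = 0x10,
    CW_MASK_PRECISION = 0x20
};

/* Nonzero if the given exception is masked in control word cw. */
int cw_is_masked(uint16_t cw, unsigned mask_bit)
{
    assert(mask_bit != 0 && (mask_bit & ~0x3Fu) == 0);
    return (cw & mask_bit) != 0;
}

/* Rounding-control field: 0 = nearest, 1 = down, 2 = up, 3 = chop. */
unsigned cw_rounding_control(uint16_t cw)
{
    return (cw >> 10) & 0x3u;
}

/* Significand width selected by the precision-control field,
   or 0 for the reserved encoding. */
unsigned cw_precision_bits(uint16_t cw)
{
    switch ((cw >> 8) & 0x3u) {
    case 0:  return 24;
    case 2:  return 53;
    case 3:  return 64;
    default: return 0;
    }
}
```

For 033EH this gives 64-bit precision, round to nearest, and every exception masked except invalid operation.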
5181
5182 The PL/M-386 version of ARRSUM (Figure 5-5) is very straightforward and
5183 illustrates how easily the 80387 can be used in this language. After
5184 declaring variables, the program calls built-in procedures to initialize the
5185 processor (or its emulator) and to load the control word. The program
5186 clears the sum variables and then steps through X$ARRAY with a DO-loop. The
5187 loop control takes into account PL/M-386's practice of considering the
5188 index of the first element of an array to be 0. In the computation of
5189 SUM$INDEXES, the built-in procedure FLOAT converts I+1 from integer to real
5190 because the language does not support "mixed mode" arithmetic. One of the
5191 strengths of the NPX, of course, is that it does support arithmetic on mixed
5192 data types (because all values are converted internally to the 80-bit
5193 extended-precision real format).
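
The computation itself is independent of the source language. As an illustration only, the three sums can be written in C as follows; the struct and function names are hypothetical, and the explicit cast plays the role that FLOAT plays in the PL/M-386 version.

```c
#include <assert.h>
#include <stddef.h>

/* The three running sums accumulated by ARRSUM. */
struct arr_sums {
    float sum_x;        /* sum of the array values            */
    float sum_indexes;  /* sum of each value times its index  */
    float sum_squares;  /* sum of each value squared          */
};

/* Accumulate the sums over the first n_of_x elements of x_array.
   Element 0 has index 1, element 1 has index 2, and so on. */
struct arr_sums arrsum(const float *x_array, size_t n_of_x)
{
    struct arr_sums s = { 0.0f, 0.0f, 0.0f };
    size_t i;

    assert(x_array != NULL || n_of_x == 0);
    for (i = 0; i < n_of_x; i++) {
        float x = x_array[i];
        s.sum_x       += x;
        s.sum_indexes += x * (float)(i + 1);
        s.sum_squares += x * x;
    }
    return s;
}
```

For the three-element array 1.0, 2.0, 3.0 the sums are 6.0, 14.0, and 14.0 respectively.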
5194
5195 The ASM386 version (Figure 5-6) defines the external procedure INIT387,
5196 which makes the different initialization requirements of the processor and
5197 its emulator transparent to the source code. After defining the data and
5198 setting up the segment registers and stack pointer, the program calls
5199 INIT387 and loads the control word. The computation begins with the next
5200 three instructions, which clear three registers by loading (pushing) zeros
5201 onto the stack. As shown in Figure 5-7, these registers remain at the
5202 bottom of the stack throughout the computation while temporary values are
5203 pushed on and popped off the stack above them.
5204
5205 The program uses the CPU LOOP instruction to control its iteration through
5206 X_ARRAY; register ECX, which LOOP automatically decrements, is loaded with
5207 N_OF_X, the number of array elements to be summed. Register ESI is used to
5208 select (index) the array elements. The program steps through X_ARRAY from
5209 back to front, so ESI is initialized to point at the element just beyond the
5210 first element to be processed. The ASM386 TYPE operator is used to determine
5211 the number of bytes in each array element. This permits changing X_ARRAY to
5212 a double-precision real array by simply changing its definition (DD to DQ)
5213 and reassembling.
5214
5215 Figure 5-7 shows the effect of the instructions in the program loop on the
5216 NPX register stack. The figure assumes that the program is in its first
5217 iteration, that N_OF_X is 20, and that X_ARRAY(19) (the 20th element)
5218 contains the value 2.5. When the loop terminates, the three sums are left as
5219 the top stack elements so that the program ends by simply popping them into
5220 memory variables.
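
The stack behavior described above can be traced with a small software model. The C sketch below is an illustration, not 80387 code: it represents the register stack as a simple array with a depth counter (the real NPX addresses eight registers through the TOP field of the status word), all names are hypothetical, and the instruction sequence in the comments follows the eight steps of Figure 5-7.

```c
#include <assert.h>

#define NPX_REGS 8

/* Simplified register stack: ST(i) is reg[depth - 1 - i]. */
struct npx_stack {
    double reg[NPX_REGS];
    int depth;
};

double *st(struct npx_stack *s, int i)
{
    assert(i >= 0 && i < s->depth);
    return &s->reg[s->depth - 1 - i];
}

void npx_push(struct npx_stack *s, double v)
{
    assert(s->depth < NPX_REGS);   /* a real push here would fault */
    s->reg[s->depth++] = v;
}

void npx_pop(struct npx_stack *s)
{
    assert(s->depth > 0);
    s->depth--;
}

/* FLDZ, FLDZ, FLDZ: ST(0)=SUM_SQUARES, ST(1)=SUM_INDEXES, ST(2)=SUM_X. */
void arrsum_init(struct npx_stack *s)
{
    s->depth = 0;
    npx_push(s, 0.0);
    npx_push(s, 0.0);
    npx_push(s, 0.0);
}

/* One pass of the loop body for element value x and index n. */
void arrsum_step(struct npx_stack *s, double x, long n)
{
    double top;

    npx_push(s, x);             /* FLD   X_ARRAY[ESI]               */
    *st(s, 3) += *st(s, 0);     /* FADD  ST(3), ST    SUM_X += x    */
    npx_push(s, *st(s, 0));     /* FLD   ST           duplicate x   */
    top = *st(s, 0);
    *st(s, 0) = top * top;      /* FMUL  ST, ST       x squared     */
    *st(s, 2) += *st(s, 0);     /* FADDP ST(2), ST    SUM_SQUARES   */
    npx_pop(s);
    *st(s, 0) *= (double)n;     /* FIMUL N_OF_X       x times index */
    *st(s, 2) += *st(s, 0);     /* FADDP ST(2), ST    SUM_INDEXES   */
    npx_pop(s);
}
```

After arrsum_init and one call of arrsum_step with x = 2.5 and n = 20, the model holds 6.25, 50.0, and 2.5 in ST(0) through ST(2), matching the last panel of Figure 5-7.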
5221
5222
5223 Figure 5-5.  Sample PL/M-386 Program
5224
5225 XENIX286 PL/M-386 DEBUG X291a COMPILATION OF MODULE ARRAYSUM
5226 OBJECT MODULE PLACED IN arraysum.obj
5227 COMPILER INVOKED BY:  plm386 arraysum.plm
5228
5229
5230             /***********************************************************
5231             *                                                          *
5232             *                      ARRAYSUM  MODULE                    *
5233             *                                                          *
5234             ***********************************************************/
5235
5236   1         array$sum:      do;
5237
5238   2   1        declare (sum$x, sum$indexes, sum$squares) real;
5239   3   1        declare x$array(100) real;
5240   4   1        declare (n$of$x, i) integer;
5241   5   1        declare control$387 literally '033eh';
5242
5243                /* Assume x$array and n$of$x are initialized */
5244   6   1        call init$real$math$unit;
5245   7   1        call set$real$mode(control$387);
5246
5247                /* Clear sums */
5248   8   1        sum$x, sum$indexes, sum$squares = 0.0;
5249
5250                /* Loop through array, accumulating sums */
5251   9   1        do i = 0 to n$of$x - 1;
5252  10   2             sum$x = sum$x + x$array(i);
5253  11   2             sum$indexes = sum$indexes + (x$array(i)*float(i+1));
5254  12   2             sum$squares = sum$squares + (x$array(i)*x$array(i));
5255  13   2        end;
5256
5257                /* etc. */
5258
5259  14   1     end array$sum;
5260
5261
5262  MODULE INFORMATION:
5263
5264    CODE AREA SIZE      = 000000A0H       160D
5265    CONSTANT AREA SIZE  = 00000004H         4D
5266    VARIABLE AREA SIZE  = 000001A4H       420D
5267    MAXIMUM  STACK SIZE = 00000004H         4D
5268    32 LINES READ
5269    0 PROGRAM WARNINGS
5270    0 PROGRAM ERRORS
5271
5272  DICTIONARY SUMMARY:
5273
5274    8KB MEMORY USED
5275    0KB DISK SPACE USED
5276
5277  END OF PL/M-386 COMPILATION
5278
5279
5280 Figure 5-6.  Sample ASM386 Program
5281
5282 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE ARRAYSUM
5283 OBJECT MODULE PLACED IN arraysum.obj
5284 ASSEMBLER INVOKED BY: asm386 arraysum.asm
5285
5286 LOC       OBJ                         LINE    SOURCE
5287
5288                                1    name       arraysum
5289                                2
5290                                3    ; Define initialization routine
5291                                4
5292                                5    extrn      init387:far
5293                                6
5294                                7    ; Allocate space for data
5295                                8
5296 --------                       9    data       segment rw public
5297 00000000  3E03                10    control_387        dw 033eh
5298 00000002  ????????            11    n_of_x             dd ?
5299 00000006  (100                12    x_array            dd 100 dup (?)
5300           ????????
5301           )
5302 00000196  ????????            13    sum_squares        dd ?
5303 0000019A  ????????            14    sum_indexes        dd ?
5304 0000019E  ????????            15    sum_x              dd ?
5305 --------                      16    data       ends
5306                               17
5307                               18    ; Allocate CPU stack space
5308                               19
5309 --------                      20    stack      stackseg   400
5310                               21
5311                               22    ; Begin code
5312                               23
5313 --------                      24    code       segment er public
5314                               25
5315                               26    assume  ds:data, ss:stack
5316                               27
5317 00000000                      28    start:
5318 00000000  66B8----        R   29         mov     ax, data
5319 00000004  8ED8                30         mov     ds, ax
5320 00000006  66B8----        R   31         mov     ax, stack
5321 0000000A  B800000000          32         mov     eax, 0h
5322 0000000F  8ED0                33         mov     ss, ax
5323 00000011  BC00000000      R   34         mov     esp, stackstart stack
5324                               35
5325                               36    ; Assume x_array and n_of_x have
5326                               37    ; been initialized
5327                               38
5328                               39    ; Prepare the 80387 or its emulator
5329                               40
5330 00000016  9A00000000----  E   41         call    init387
5331 0000001D  D92D00000000    R   42         fldcw   control_387
5332                               43
5333                               44    ; Clear three registers to hold
5334                               45    ; running sums
5335                               46
5336 00000023  D9EE                47         fldz
5337 00000025  D9EE                48         fldz
5338 00000027  D9EE                49         fldz
5339                               50
5340                               51    ; Setup ECX as loop counter and ESI
5341                               52    ; as index into x_array
5342                               53
5343 00000029  8B0D02000000    R   54         mov     ecx, n_of_x
5344 0000002F  F7E9                55         imul    ecx
5345 00000031  8BF0                56         mov     esi, eax
5346                               57
5347                               58    ; ESI now contains index of last
5348                               59    ; element + 1
5349                               60    ; Loop through x_array and
5350                               61    ; accumulate sum
5351                               62
5352 00000033                      63    sum_next:
5353                               64    ; backup one element and push on
5354                               65    ; the stack
5355                               66
5356 00000033  83EE04              67         sub     esi,  type x_array
5357 00000036  D98606000000    R   68         fld     x_array[esi]
5358                               69
5359                               70    ; add to the sum and duplicate x
5360                               71    ; on the stack
5361                               72
5362 0000003C  DCC3                73         fadd    st(3), st
5363 0000003E  D9C0                74         fld     st
5364                               75
5365                               76    ; square it and add into the sum of
5366                               77    ; squares and discard
5367                               78
5368 00000040  DCC8                79         fmul    st, st
5369 00000042  DEC2                80         faddp   st(2), st
5370                               81
5371                               82    ; reduce index for next iteration
5372                               83
5373 00000044  FF0D02000000    R   84         dec     n_of_x
5374 0000004A  E2E7                85         loop    sum_next
5375                               86
5376                               87    ; Pop sums into memory
5377                               88
5378 0000004C                      89    pop_results:
5379 0000004C  D91D96010000    R   90         fstp    sum_squares
5380 00000052  D91D9A010000    R   91         fstp    sum_indexes
5381 00000058  D91D9E010000    R   92         fstp    sum_x
5382 0000005E  9B                  93         fwait
5383                               94
5384                               95    ;
5385                               96    ; Etc.
5386                               97    ;
5387 --------                      98    code       ends
5388                               99    end     start, ds:data, ss:stack
5389
5390 ASSEMBLY COMPLETE,    NO WARNINGS,    NO ERRORS.
5391
5392
5393 Figure 5-7.  Instructions and Register Stack
5394
5395          FLDZ, FLDZ, FLDZ                       FLD X_ARRAY[SI]
5396          +--------------+ - - - - - - - - - -> +--------------+
5397     ST(0)|      0.0     | SUM_SQUARES     ST(0)|      2.5     | X_ARRAY(19)
5398          +--------------+                      +--------------+
5399     ST(1)|      0.0     | SUM_INDEXES     ST(1)|      0.0     | SUM_SQUARES
5400          +--------------+                      +--------------+
5401     ST(2)|      0.0     | SUM_X           ST(2)|      0.0     | SUM_INDEXES
5402          +--------------+                      +--------------+
5403                                           ST(3)|      0.0     | SUM_X
5404                            + - - - - - - - - - +--------------+
5405                            |
5406           FADD ST(3), ST <-+                        FLD ST
5407          +--------------+ - - - - - - - - - -> +--------------+
5408     ST(0)|      2.5     | X_ARRAY(19)     ST(0)|      2.5     | X_ARRAY(19)
5409          +--------------+                      +--------------+
5410     ST(1)|      0.0     | SUM_SQUARES     ST(1)|      2.5     | X_ARRAY(19)
5411          +--------------+                      +--------------+
5412     ST(2)|      0.0     | SUM_INDEXES     ST(2)|      0.0     | SUM_SQUARES
5413          +--------------+                      +--------------+
5414     ST(3)|      2.5     | SUM_X           ST(3)|      0.0     | SUM_INDEXES
5415          +--------------+                      +--------------+
5416                                           ST(4)|      2.5     | SUM_X
5417                            + - - - - - - - - - +--------------+
5418                            |
5419            FMUL ST, ST  <--+                    FADDP ST(2), ST
5420          +--------------+ - - - - - - - - - -> +--------------+
5421     ST(0)|      6.25    | X_ARRAY(19)²    ST(0)|      2.5     | X_ARRAY(19)
5422          +--------------+                      +--------------+
5423     ST(1)|      2.5     | X_ARRAY(19)     ST(1)|      6.25    | SUM_SQUARES
5424          +--------------+                      +--------------+
5425     ST(2)|      0.0     | SUM_SQUARES     ST(2)|      0.0     | SUM_INDEXES
5426          +--------------+                      +--------------+
5427     ST(3)|      0.0     | SUM_INDEXES     ST(3)|      2.5     | SUM_X
5428          +--------------+                      +--------------+
5429     ST(4)|      2.5     | SUM_X                |
5430          +--------------+                      |
5431                            + - - - - - - - - - +
5432            FIMUL N_OF_X <--+                    FADDP ST(2), ST
5433          +--------------+ - - - - - - - - - -> +--------------+
5434     ST(0)|      50.0    | X_ARRAY(19)*20  ST(0)|      6.25    | SUM_SQUARES
5435          +--------------+                      +--------------+
5436     ST(1)|      6.25    | SUM_SQUARES     ST(1)|      50.0    | SUM_INDEXES
5437          +--------------+                      +--------------+
5438     ST(2)|      0.0     | SUM_INDEXES     ST(2)|      2.5     | SUM_X
5439          +--------------+                      +--------------+
5440     ST(3)|      2.5     | SUM_X
5441          +--------------+
5442
5443
5444 5.1.6  80387 Emulation
5445
5446 Programming applications to execute both on 80386 systems with an 80387 and
5447 on 80386 systems without one is made much easier by the existence of an
5448 80387 emulator for 80386 systems. The Intel EMUL387 emulator offers a
5449 complete software counterpart to the 80387 hardware; NPX instructions can be
5450 simply emulated in software rather than being executed in hardware. With
5451 software emulation, the distinction between 80386 systems with or without an
5452 80387 is reduced to a simple performance differential. Identical numeric
5453 programs will simply execute more slowly (using software emulation of NPX
5454 instructions) on 80386 systems without an 80387 than on an 80386/80387
5455 system executing NPX instructions directly.
5456
5457 When incorporated into the systems software, the emulation of NPX
5458 instructions on the 80386 systems is completely transparent to the
5459 applications programmer. Applications software needs no special libraries,
5460 linking, or other activity to allow it to run on an 80386 with 80387
5461 emulation.
5462
5463 To the applications programmer, the development of programs for 80386
5464 systems is the same whether the 80387 NPX hardware is available or not. The
5465 full 80387 instruction set is available for use, with NPX instructions being
5466 either emulated or executed directly. Applications programmers need not be
5467 concerned with the hardware configuration of the computer systems on which
5468 their applications will eventually run.
5469
5470 For systems programmers, details relating to 80387 emulators are described
5471 in Chapter 6.
5472
5473 The EMUL387 software emulator for 80386 systems is available from Intel as
5474 a separate program product.
5475
5476
5477 5.2  Concurrent Processing With the 80387
5478
5479 Because the 80386 CPU and the 80387 NPX have separate execution units, it
5480 is possible for the NPX to execute numeric instructions in parallel with
5481 instructions executed by the CPU. This simultaneous execution of different
5482 instructions is called concurrency.
5483
5484 No special programming techniques are required to gain the advantages of
5485 concurrent execution; numeric instructions for the NPX are simply placed in
5486 line with the instructions for the CPU. CPU and numeric instructions are
5487 initiated in the same order as they are encountered by the CPU in its
5488 instruction stream. However, because numeric operations performed by the NPX
5489 generally require more time than operations performed by the CPU, the CPU
5490 can often execute several of its instructions before the NPX completes a
5491 numeric instruction previously initiated.
5492
5493 This concurrency offers obvious advantages in terms of execution
5494 performance, but concurrency also imposes several rules that must be
5495 observed in order to assure proper synchronization of the 80386 CPU and
5496 80387 NPX.
5497
5498 All Intel high-level languages automatically provide for and manage
5499 concurrency in the NPX. Assembly-language programmers, however, must
5500 understand and manage some areas of concurrency in exchange for the
5501 flexibility and performance of programming in assembly language. This
5502 section is for the assembly-language programmer or well-informed
5503 high-level-language programmer.
5504
5505
5506 5.2.1  Managing Concurrency
5507
5508 Concurrent execution of the host and 80387 is easy to establish and
5509 maintain. The activities of numeric programs can be split into two major
5510 areas: program control and arithmetic. The program control part performs
5511 activities such as deciding what functions to perform, calculating addresses
5512 of numeric operands, and loop control. The arithmetic part simply adds,
5513 subtracts, multiplies, and performs other operations on the numeric
5514 operands. The NPX and host are designed to handle these two parts separately
5515 and efficiently.
5516
5517 Concurrency management is required to check for an exception before letting
5518 the 80386 change a value just used by the 80387. Almost any numeric
5519 instruction can, under the wrong circumstances, produce a numeric exception.
5520 For programmers in higher-level languages, all required synchronization is
5521 automatically provided by the appropriate compiler. In assembly-language
5522 programs, however, exception synchronization remains the responsibility of
5523 the programmer.
5524
5525 A complication is that a programmer may not expect his numeric program to
5526 cause numeric exceptions, but in some systems, they may regularly happen. To
5527 better understand these points, consider what can happen when the NPX
5528 detects an exception.
5529
5530 Depending on options determined by the software system designer, the NPX
5531 can perform one of two things when a numeric exception occurs:
5532
5533   o  The NPX can provide a default fix-up for selected numeric exceptions.
5534      Programs can mask individual exception types to indicate that the NPX
5535      should generate a safe, reasonable result whenever that exception
5536      occurs. The default exception fix-up activity is treated by the NPX as
5537      part of the instruction causing the exception; no external indication
5538      of the exception is given. When exceptions are detected, a flag is set
5539      in the numeric status register, but no information regarding where or
5540      when is available. If the NPX performs its default action for all
5541      exceptions, then the need for exception synchronization is not
5542      manifest. However, as will be shown later, this is not sufficient
5543      reason to ignore exception synchronization when designing programs that
5544      use the 80387.
5545
5546   o  As an alternative to the NPX default fix-up of numeric exceptions, the
5547      80386 CPU can be notified whenever an exception occurs. When a numeric
5548      exception is unmasked and the exception occurs, the NPX stops further
5549      execution of the numeric instruction and signals this event to the CPU.
5550      On the next occurrence of an ESC or WAIT instruction, the CPU traps to
5551      a software exception handler. The exception handler can then implement
5552      any sort of recovery procedures desired for any numeric exception
5553      detectable by the NPX. Some ESC instructions do not check for
5554      exceptions. These are the nonwaiting forms FNINIT, FNSTENV, FNSAVE,
5555      FNSTSW, FNSTCW, and FNCLEX.
5556
5557 When the NPX signals an unmasked exception condition, it is requesting
5558 help. The fact that the exception was unmasked indicates that further
5559 numeric program execution under the arithmetic and programming rules of the
5560 NPX is unreasonable.
5561
5562 If concurrent execution is allowed, the state of the CPU when it recognizes
5563 the exception is undefined. The CPU may have changed many of its internal
5564 registers and be executing a totally different program by the time the
5565 exception occurs. To handle this situation, the NPX has special registers
5566 updated at the start of each numeric instruction to describe the state of
5567 the numeric program when the failed instruction was attempted.
5568
5569 Exception synchronization ensures that the NPX is in a well-defined state
5570 after an unmasked numeric exception occurs. Without a well-defined state, it
5571 would be impossible for exception recovery routines to determine why the
5572 numeric exception occurred, or to recover successfully from the exception.
5573
5574 The following two sections illustrate the need to always consider
5575 exception synchronization when writing 80387 code, even when the code is
5576 initially intended for execution with exceptions masked. If the code is
5577 later moved to an environment where exceptions are unmasked, the same code
5578 may not work correctly. An example of how some instructions written without
5579 exception synchronization will work initially, but fail when moved into a
5580 new environment is shown in Figure 5-8.
5581
5582
5583 Figure 5-8.  Exception Synchronization Examples
5584
5585 INCORRECT ERROR SYNCHRONIZATION
5586
5587 FILD   COUNT  ; NPX instruction
5588 INC    COUNT  ; CPU instruction alters operand
5589 FSQRT         ; subsequent NPX instruction -- error from
5590               ;    previous NPX instruction detected here
5591
5592 PROPER ERROR SYNCHRONIZATION
5593
5594 FILD   COUNT  ; NPX instruction
5595 FSQRT         ; subsequent NPX instruction -- error from
5596               ;    previous NPX instruction detected here
5597 INC    COUNT  ; CPU instruction alters operand
5598
5599
5600 5.2.1.1  Incorrect Exception Synchronization
5601
5602 In Figure 5-8, three instructions are shown to load an integer, calculate
5603 its square root, then increment the integer. The 80386-to-80387 interface
5604 and synchronous execution of the NPX emulator will allow this program to
5605 execute correctly when no exceptions occur on the FILD instruction.
5606
5607 This situation changes if the 80387 numeric register stack is extended to
5608 memory. To extend the NPX stack to memory, the invalid exception is
5609 unmasked. A push to a full register or pop from an empty register sets SF
5610 and causes an invalid exception.
5611
5612 The recovery routine for the exception must recognize this situation, fix
5613 up the stack, then perform the original operation.  The recovery routine
5614 will not work correctly in the first example shown in the figure. The
5615 problem is that the value of COUNT is incremented before the NPX can signal
5616 the exception to the CPU. Because COUNT is incremented before the exception
5617 handler is invoked, the recovery routine will load an incorrect value of
5618 COUNT, causing the program to fail or behave unreliably.
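
The ordering problem can be reproduced with a toy model. The C sketch below is an illustration under simplifying assumptions, not a description of the hardware: the failed FILD merely leaves an error pending, the CPU instruction runs without consulting the NPX, and the recovery routine repeats the load by reading COUNT from memory when the next exception-checking ESC or WAIT is reached. All names are hypothetical, and the arithmetic of FSQRT itself is ignored.

```c
#include <assert.h>

struct machine {
    long   count;          /* memory operand COUNT                  */
    int    error_pending;  /* models ERROR# asserted by the NPX     */
    double st0;            /* value finally loaded by the handler   */
};

/* FILD COUNT with a full register stack: the load cannot complete,
   and the unmasked invalid exception is left pending. */
void fild_count_faulting(struct machine *m)
{
    m->error_pending = 1;
}

/* INC COUNT: a CPU instruction, executed without checking the NPX. */
void inc_count(struct machine *m)
{
    m->count++;
}

/* An exception-checking ESC (such as FSQRT) or WAIT: the pending error
   is recognized here, and the handler repeats the load from memory. */
void esc_or_wait(struct machine *m)
{
    if (m->error_pending) {
        m->error_pending = 0;
        m->st0 = (double)m->count;
    }
}

/* FILD, INC, FSQRT: COUNT changes before the handler runs. */
double run_incorrect(long count)
{
    struct machine m = { count, 0, 0.0 };
    fild_count_faulting(&m);
    inc_count(&m);
    esc_or_wait(&m);
    return m.st0;
}

/* FILD, FSQRT, INC: the handler runs while COUNT is still intact. */
double run_proper(long count)
{
    struct machine m = { count, 0, 0.0 };
    fild_count_faulting(&m);
    esc_or_wait(&m);
    inc_count(&m);
    assert(m.count == count + 1);
    return m.st0;
}
```

With COUNT initially 5, the incorrect ordering makes the handler load 6, while the proper ordering loads 5 and still leaves COUNT incremented afterward.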
5619
5620
5621 5.2.1.2  Proper Exception Synchronization
5622
5623 Exception synchronization relies on the WAIT instruction and the BUSY# and
5624 ERROR# signals of the 80387. When an unmasked exception occurs in the 80387,
5625 it asserts the ERROR# signal, signaling to the CPU that a numeric exception
5626 has occurred. The next time the CPU encounters a WAIT instruction or an
5627 exception-checking ESC instruction, the CPU acknowledges the ERROR# signal
5628 by trapping automatically to Interrupt #16, the processor-extension
5629 exception vector. If the following ESC or WAIT instruction is properly
5630 placed, the CPU will not yet have disturbed any information vital to
5631 recovery from the exception.
5632
5633
5634 Chapter 6  System-Level Numeric Programming
5635
5636 ----------------------------------------------------------------------------
5637
5638 System programming for 80387 systems requires a more detailed understanding
5639 of the 80387 NPX than does application programming. Such things as
5640 emulation, initialization, exception handling, and data and error
5641 synchronization are all the responsibility of the systems programmer. These
5642 topics are covered in detail in the sections that follow.
5643
5644
5645 6.1  80386/80387 Architecture
5646
5647 On a software level, the 80387 NPX appears as an extension of the 80386
5648 CPU. On the hardware level, however, the mechanisms by which the 80386 and
5649 80387 interact are more complex. This section describes how the 80387 NPX
5650 and 80386 CPU interact and points out features of this interaction that are
5651 of interest to systems programmers.
5652
5653
5654 6.1.1  Instruction and Operand Transfer
5655
5656 All transfers of instructions and operands between the 80387 and system
5657 memory are performed by the 80386 using I/O bus cycles. The 80387 appears to
5658 the CPU as a special peripheral device. It is special in two respects: the
5659 CPU initiates I/O automatically when it encounters ESC instructions, and the
5660 CPU uses reserved I/O addresses to communicate with the 80387. These I/O
5661 operations are completely transparent to software.
5662
5663 Because the 80386 actually performs all transfers between the 80387 and
5664 memory, no additional bus drivers, controllers, or other components are
5665 necessary to interface the 80387 NPX to the local bus. The 80387 can utilize
5666 instructions and operands located in any memory accessible to the 80386 CPU.
5667
5668
5669 6.1.2  Independent of CPU Addressing Modes
5670
5671 Unlike the 80287, the 80387 is not sensitive to the addressing and memory
5672 management of the CPU. The 80387 operates the same regardless of whether the
5673 80386 CPU is operating in real-address mode, in protected mode, or in
5674 virtual 8086 mode.
5675
5676 The instruction FSETPM that was necessary in 80286/80287 systems to set the
5677 80287 into protected mode is not needed for the 80387. The 80387 treats this
5678 instruction as a no-op.
5679
5680 Because the 80386 actually performs all transfers between the 80387 and
5681 memory, 80387 instructions can utilize any memory location accessible by the
5682 task currently executing on the 80386. When operating in protected mode, all
5683 references to memory operands are automatically verified by the 80386's
5684 memory management and protection mechanisms as for any other memory
5685 references by the currently-executing task. Protection violations associated
5686 with NPX instructions automatically cause the 80386 to trap to an
5687 appropriate exception handler.
5688
5689 To the numerics programmer, the operating modes of the 80386 affect only
5690 the manner in which the NPX instruction and data pointers are represented in
5691 memory following an FSAVE or FSTENV instruction. Each of these instructions
5692 produces one of four formats depending on both the operating mode and on the
5693 operand-size attribute in effect for the instruction. The differences are
5694 detailed in the discussion of the FSAVE and FSTENV instructions in
5695 Chapter 4.
5696
5697
5698 6.1.3  Dedicated I/O Locations
5699
5700 The 80387 NPX does not require that any memory addresses be set aside for
5701 special purposes. The 80387 does make use of I/O port addresses, but these
5702 are 32-bit addresses with the high-order bit set (i.e., 80000000H or above);
5703 therefore, these I/O operations are completely transparent to the 80386
5704 software. Because these addresses are beyond the 64 Kbyte I/O addressing
5705 limit of I/O instructions, 80386 programs cannot reference these reserved
5706 I/O addresses directly.
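
The reasoning is a simple range comparison, sketched here in C. The function names are hypothetical; the only facts assumed are the two stated above, that the reserved addresses have bit 31 set and that I/O instructions reach only the first 64 Kbytes of the I/O space.

```c
#include <assert.h>
#include <stdint.h>

#define IO_INSTRUCTION_LIMIT 0x0000FFFFu  /* highest port an I/O instruction can name */
#define NPX_RESERVED_BIT     0x80000000u  /* high-order address bit */

/* Nonzero if addr lies in the range reserved for CPU-to-NPX transfers. */
int is_npx_reserved_address(uint32_t addr)
{
    return (addr & NPX_RESERVED_BIT) != 0;
}

/* Nonzero if a program could reach addr with an I/O instruction. */
int is_reachable_by_io_instruction(uint32_t addr)
{
    return addr <= IO_INSTRUCTION_LIMIT;
}

/* No address can satisfy both tests; checked here for one value. */
void check_disjoint(uint32_t addr)
{
    assert(!(is_npx_reserved_address(addr) &&
             is_reachable_by_io_instruction(addr)));
}
```

Because every reserved address is at least 80000000H and every programmable port is at most FFFFH, the two ranges cannot overlap.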
5707
5708
5709 6.2  Processor Initialization and Control
5710
5711 One of the principal responsibilities of systems software is the
5712 initialization, monitoring, and control of the hardware and software
5713 resources of the system, including the 80387 NPX. In this section, issues
5714 related to system initialization and control are described, including
5715 recognition of the NPX, emulation of the 80387 NPX in software if the
5716 hardware is not available, and the handling of exceptions that may occur
5717 during the execution of the 80387.
5718
5719
5720 6.2.1  System Initialization
5721
5722 During initialization of an 80386 system, systems software must
5723
5724   o  Recognize the presence or absence of the NPX.
5725
5726   o  Set flags in the 80386 MSW to reflect the state of the numeric
5727      environment.
5728
5729 If an 80387 NPX is present in the system, the NPX must be initialized. All
5730 of these activities can be quickly and easily performed as part of the
5731 overall system initialization.
5732
5733
5734 6.2.2  Hardware Recognition of the NPX
5735
5736 The 80386 identifies the type of its coprocessor (80287 or 80387) by
5737 sampling its ERROR# input some time after the falling edge of RESET and
5738 before executing the first ESC instruction. The 80287 keeps its ERROR#
5739 output in inactive state after hardware reset; the 80387 keeps its ERROR#
5740 output in active state after hardware reset. The 80386 records this
5741 difference in the ET bit of control register zero (CR0). The 80386
5742 subsequently uses ET to control its interface with the coprocessor. If ET is
5743 set, it employs the 32-bit protocol of the 80387; if ET is not set, it
5744 employs the 16-bit protocol of the 80287.
5745
5746 Systems software can (if necessary) change the value of ET. There are three
5747 reasons that ET may not be set:
5748
5749   1.  An 80287 is actually present.
5750
5751   2.  No coprocessor is present.
5752
5753   3.  An 80387 is present but it is connected in a nonstandard manner that
5754       does not trigger the setting of ET.
5755
5756 An example of case three is the PC/AT-compatible design described in
5757 Appendix F. In such cases, initialization software may need to change the
5758 value of ET.
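
The role of ET can be modeled in a few lines of C. This is only a sketch of
the bit logic, not code that touches CR0 itself (which requires privileged
80386 instructions). The bit positions are those of the 80386 CR0 register;
the enum and function names are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Low-order CR0 bits of the 80386. */
#define CR0_PE (1u << 0)   /* protection enable   */
#define CR0_MP (1u << 1)   /* monitor coprocessor */
#define CR0_EM (1u << 2)   /* emulate coprocessor */
#define CR0_TS (1u << 3)   /* task switched       */
#define CR0_ET (1u << 4)   /* extension type      */

/* Hypothetical names for the two coprocessor interface protocols. */
enum npx_protocol { NPX_PROTOCOL_16BIT_80287, NPX_PROTOCOL_32BIT_80387 };

/* Protocol the 80386 uses for a given CR0 image: ET set selects the
   32-bit 80387 protocol, ET clear selects the 16-bit 80287 protocol. */
enum npx_protocol protocol_for_cr0(uint32_t cr0)
{
    return (cr0 & CR0_ET) ? NPX_PROTOCOL_32BIT_80387
                          : NPX_PROTOCOL_16BIT_80287;
}

/* CR0 image that initialization software would write back when it has to
   override ET (case 3 above). All other bits are preserved. */
uint32_t cr0_with_et(uint32_t cr0, int npx_is_80387)
{
    return npx_is_80387 ? (cr0 | CR0_ET) : (cr0 & ~CR0_ET);
}
```

In a nonstandard design, initialization software would first determine the
coprocessor type by other means (for example, the software routine of the
next section) and only then write the adjusted image back to CR0.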
5759
5760
5761 6.2.3  Software Recognition of the NPX
5762
5763 Figure 6-1 shows an example of a recognition routine that determines
5764 whether an NPX is present, and distinguishes between the 80387 and the
5765 8087/80287. This routine can be executed on any 80386, 80286, or 8086
5766 hardware configuration that has an NPX socket.
5767
5768 The example guards against the possibility of accidentally reading an
5769 expected value from a floating data bus when no NPX is present. Data read
5770 from a floating bus is undefined. By expecting to read a specific bit
5771 pattern from the NPX, the routine protects itself from the indeterminate
5772 state of the bus. The example also avoids depending on any values in
5773 reserved bits, thereby maintaining compatibility with future numerics
5774 coprocessors.
5775
5776
5777 Figure 6-1.  Software Routine to Recognize the 80287
5778
5779 8086/87/88/186 MACRO ASSEMBLER  Test for presence of a Numerics Chip, Revision 1.0
5780
5781
5782 DOS 3.20 (033-N) 8086/87/88/186 MACRO ASSEMBLER V2.0 ASSEMBLY OF MODULE TEST_NPX
5783 OBJECT MODULE PLACED IN FINDNPX.OBJ
5784
5785 LOC   OBJ         LINE    SOURCE
5786
5787                      1 +1 $title('Test for presence of a Numerics Chip, Revision 1.0')
5788                      2
5789                      3            name    Test_NPX
5790                      4
5791 ----                 5    stack   segment stack 'stack'
5792 0000  (100           6            dw      100 dup (?)
5793       ????
5794       )
5795 00C8  ????           7    sst     dw              ?
5796 ----                 8    stack   ends
5797                      9
5798 ----                10    data    segment public 'data'
5799 0000  0000          11    temp    dw      0h
5800 ----                12    data    ends
5801                     13
5802                     14    dgroup group    data, stack
5803                     15    cgroup group    code
5804                     16
5805 ----                17    code    segment public 'code'
5806                     18            assume cs:cgroup, ds:dgroup
5807                     19
5808 0000                20    start:
5809                     21    ;
5810                     22    ;       Look for an 8087, 80287, or 80387 NPX.
5811                     23    ;       Note that we cannot execute WAIT on 8086/88 if no 8087 is present.
5812                     24    ;
5813 0000                25    test_npx:
5814 0000  90DBE3        26            fninit                  ; Must use non-wait form
5815 0003  BE0000     R  27            mov     si,offset dgroup:temp
5816 0006  C7045A5A      28            mov     word ptr [si],5A5AH ; Initialize temp to non-zero value
5817 000A  90DD3C        29            fnstsw  [si]            ; Must use non-wait form of fstsw
5818                     30                                    ; It is not necessary to use a WAIT instruction
5819                     31                                    ;  after fnstsw or fnstcw.  Do not use one here.
5820 000D  803C00        32            cmp     byte ptr [si],0 ; See if correct status with zeroes was read
5821 0010  752A          33            jne     no_npx          ; Jump if not a valid status word, meaning no NPX
5822                     34    ;
5823                     35    ;       Now see if ones can be correctly written from the control word.
5824                     36    ;
5825 0012  90D93C        37            fnstcw  [si]            ; Look at the control word; do not use WAIT form
5826                     38                                    ; Do not use a WAIT instruction here!
5827 0015  8B04          39            mov     ax,[si]         ; See if ones can be written by NPX
5828 0017  253F10        40            and     ax,103fh        ; See if selected parts of control word look OK
5829 001A  3D3F00        41            cmp     ax,3fh          ; Check that ones and zeroes were correctly read
5830 001D  751D          42            jne     no_npx          ; Jump if no NPX is installed
5831                     43    ;
5832                     44    ;       Some numerics chip is installed.  NPX instructions and WAIT are now safe.
5833                     45    ;       See if the NPX is an 8087, 80287, or 80387.
5834                     46    ;       This code is necessary if a denormal exception handler is used or the
5835                     47    ;       new 80387 instructions will be used.
5836                     48    ;
5837 001F 9BD9E8         49            fld1                    ; Must use default control word from FNINIT
5838 0022 9BD9EE         50            fldz                    ; Form infinity
5839 0025 9BDEF9         51            fdiv                    ; 8087/287 says +inf = -inf
5840 0028 9BD9C0         52            fld     st              ; Form negative infinity
5841 002B 9BD9E0         53            fchs                    ; 80387 says +inf <> -inf
5842 002E 9BDED9         54            fcompp                  ; See if they are the same and remove them
5843 0031 9BDD3C         55            fstsw   [si]            ; Look at status from FCOMPP
5844 0034 8B04           56            mov     ax,[si]
5845 0036 9E             57            sahf                    ; See if the infinities matched
5846 0037 7406           58            je      found_87_287    ; Jump if 8087/287 is present
5847                     59    ;
5848                     60    ;       An 80387 is present.  If denormal exceptions are used for an 8087/287,
5849                     61    ;       they must be masked.  The 80387 will automatically normalize denormal
5850                     62    ;       operands faster than an exception handler can.
5851                     63    ;
5852 0039 EB0790         64            jmp     found_387
5853 003C                65    no_npx:
5854                     66    ;       set up for no NPX
5855                     67    ;               ...
5856                     68    ;
5857 003C EB0490         69            jmp exit
5858 003F                70    found_87_287:
5859                     71    ;       set up for 87/287
5860                     72    ;               ...
5861                     73    ;
5862 003F EB0190         74            jmp exit
5863 0042                75    found_387:
5864                     76    ;       set up for 387
5865                     77    ;               ...
5866                     78    ;
5867 0042                79    exit:
5868 ----                80    code    ends
5869                     81            end     start,ds:dgroup,ss:dgroup:sst
5870
5871 ASSEMBLY COMPLETE, NO ERRORS FOUND
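
The decision logic of the routine in Figure 6-1 can be restated in C as a
rough model. The sketch below does not execute any NPX instruction; it takes
as inputs the values that the assembly routine reads back from memory. The
type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum npx_kind { NPX_NONE, NPX_8087_OR_80287, NPX_80387 };

/*
 * status_word:  word found in memory after FNSTSW, where the word had
 *               been preset to 5A5AH beforehand.
 * control_word: word found in memory after FNSTCW.
 * infinities_compare_equal: nonzero if the FCOMPP of +inf against -inf
 *               reported "equal", zero otherwise.
 */
enum npx_kind classify_npx(uint16_t status_word, uint16_t control_word,
                           int infinities_compare_equal)
{
    /* After FNINIT the low byte of the status word must read as zero.
       A floating bus is unlikely to overwrite 5AH with exactly 00H. */
    if ((status_word & 0x00FF) != 0)
        return NPX_NONE;

    /* The exception masks (bits 0-5) must read as ones and bit 12 as
       zero. Reserved bits are masked off and never examined. */
    if ((control_word & 0x103F) != 0x003F)
        return NPX_NONE;

    /* With the default control word the 8087/80287 report +inf equal to
       -inf; the 80387 reports them as different. */
    return infinities_compare_equal ? NPX_8087_OR_80287 : NPX_80387;
}
```

Because the test values are passed in as parameters, the same logic can be
exercised on a host that has no NPX at all.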
5872
5873
5874 6.2.4  Configuring the Numerics Environment
5875
5876 Once the 80386 CPU has determined the presence or absence of the 80387 or
5877 80287 NPX, the 80386 must set either the MP or the EM bit in its own control
5878 register zero (CR0) accordingly. The initialization routine can either
5879
5880   •  Set the MP bit in CR0 to allow numeric instructions to be executed
5881      directly by the NPX.
5882
5883   •  Set the EM bit in CR0 to permit software emulation of the numeric
5884      instructions.
5885
5886 The MP (monitor coprocessor) flag of CR0 indicates to the 80386 whether an
5887 NPX is physically available in the system. The MP flag controls the function
5888 of the WAIT instruction. When executing a WAIT instruction, the 80386 tests
5889 the task switched (TS) bit only if MP is set; if it finds TS set under these
5890 conditions, the CPU traps to exception #7.
5891
5892 The Emulation Mode (EM) bit of CR0 indicates to the 80386 whether NPX
5893 functions are to be emulated. If the CPU finds EM set when it executes an
5894 ESC instruction, program control is automatically trapped to exception #7,
5895 giving the exception handler the opportunity to emulate the functions of an
5896 80387.
5897
5898 For correct 80386 operation, the EM bit must never be set concurrently with
5899 MP. The EM and MP bits of the 80386 are described in more detail in the
5900 80386 Programmer's Reference Manual. Software emulation of the 80387 NPX
5901 is covered in the "80387 Emulation" section later in this chapter. In any
5902 case, if ESC instructions are to be executed, exactly one of the MP and EM
5903 bits must be set.
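
The interaction of MP, EM, and TS can be summarized with a small C model.
It is a sketch of the rules stated above, not system code. The function
names are hypothetical. The ESC-with-TS case is taken from the general 80386
documentation rather than from this section and is marked as such in the
comments.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_MP (1u << 1)   /* monitor coprocessor */
#define CR0_EM (1u << 2)   /* emulate coprocessor */
#define CR0_TS (1u << 3)   /* task switched       */

enum npx_opcode_class { OP_ESC, OP_WAIT };

/* Nonzero if the instruction traps to exception #7 for this CR0 image. */
int traps_to_exception_7(uint32_t cr0, enum npx_opcode_class op)
{
    if (op == OP_WAIT)      /* WAIT tests TS only when MP is set */
        return (cr0 & CR0_MP) && (cr0 & CR0_TS);

    /* ESC: EM set always traps so that the handler can emulate.
       TS set also traps (80386 behavior, not spelled out above). */
    return (cr0 & (CR0_EM | CR0_TS)) != 0;
}

/* CR0 image for initialization: exactly one of MP and EM is set. */
uint32_t configure_numerics(uint32_t cr0, int npx_present)
{
    cr0 &= ~(CR0_MP | CR0_EM);
    return cr0 | (npx_present ? CR0_MP : CR0_EM);
}
```

Clearing both bits before setting one of them guarantees that the result
never has MP and EM set together, whatever the incoming CR0 image was.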
5904
5905
5906 6.2.5  Initializing the 80387
5907
5908 Initializing the 80387 NPX simply means placing the NPX in a known state
5909 unaffected by any activity performed earlier. A single FNINIT instruction
5910 performs this initialization. All the error masks are set, all registers are
5911 tagged empty, TOP is set to zero, and default rounding and precision
5912 controls are set. Table 6-1 shows the state of the 80387 NPX following
5913 FINIT or FNINIT. This state is compatible with that of the 80287 after
5914 FINIT or after hardware RESET.
5915
5916 The FNINIT instruction does not leave the 80387 in the same state as that
5917 which results from the hardware RESET signal. Following a hardware RESET
5918 signal, such as after initial power-up, the state of the 80387 differs in
5919 the following respects:
5920
5921   1.  The mask bit for the invalid-operation exception is reset.
5922
5923   2.  The invalid-operation exception flag is set.
5924
5925   3.  The exception-summary bit is set (along with its mirror image, the
5926       B-bit).
5927
5928 These settings cause assertion of the ERROR# signal as described
5929 previously. The FNINIT instruction must be used to change the 80387 state to
5930 one compatible with the 80287.
5931
5932
5933 Table 6-1.  NPX Processor State Following Initialization
5934
5935 Field                   Value               Interpretation
5936
5937 Control Word
5938    (Infinity Control)     0                 Affine
5939       Note: The 80387 does not have infinity control. This value is
5940       listed to emphasize that programs written for the 80287 may not
5941       behave the same on the 80387 if they depend on this bit.
5942    Rounding Control       00                Round to nearest
5943    Precision Control      11                64 bits
5944    Exception Masks        111111            All exceptions masked
5945 Status Word
5946    (Busy)                 0                 --
5947    Condition Code         0000              --
5948    Stack Top              000               Register 0 is stack top
5949    Exception Summary      0                 No exceptions
5950    Stack Flag             0                 --
5951    Exception Flags        000000            No exceptions
5952 Tag Word
5953    Tags                   11                Empty
5954    Registers              N.C.              Not changed
5955 Exception Pointers
5956    Instruction Code       N.C.              Not changed
5957    Instruction Address    N.C.              Not changed
5958    Operand Address        N.C.              Not changed
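
The control word fields of Table 6-1 can be picked apart as in the
following C sketch. The field positions (masks in bits 0-5, precision
control in bits 8-9, rounding control in bits 10-11, infinity control in
bit 12) are those of the NPX control word; the struct and function names
are hypothetical. Reserved bits are deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

struct npx_control_fields {
    unsigned exception_masks;    /* bits 0-5, one bit per exception */
    unsigned precision_control;  /* bits 8-9                        */
    unsigned rounding_control;   /* bits 10-11                      */
    unsigned infinity_control;   /* bit 12; no effect on the 80387  */
};

struct npx_control_fields decode_control_word(uint16_t cw)
{
    struct npx_control_fields f;
    f.exception_masks   = cw & 0x3Fu;
    f.precision_control = (cw >> 8) & 0x3u;
    f.rounding_control  = (cw >> 10) & 0x3u;
    f.infinity_control  = (cw >> 12) & 0x1u;
    return f;
}

/* Nonzero if the decoded fields match the initialized state listed in
   Table 6-1. */
int is_initialized_control_word(uint16_t cw)
{
    struct npx_control_fields f = decode_control_word(cw);
    return f.exception_masks == 0x3F && f.precision_control == 3 &&
           f.rounding_control == 0 && f.infinity_control == 0;
}
```

For instance, a control word image of 037FH satisfies
is_initialized_control_word, while 037EH (invalid-operation mask cleared)
does not.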
5959
5960
5961 6.2.6  80387 Emulation
5962
5963 If it is determined that no 80387 NPX is available in the system, systems
5964 software may decide to emulate ESC instructions in software. This emulation
5965 is easily supported by the 80386 hardware, because the 80386 can be
5966 configured to trap to a software emulation routine whenever it encounters an
5967 ESC instruction in its instruction stream.
5968
5969 Whenever the 80386 CPU encounters an ESC instruction, and its MP and EM
5970 status bits are set appropriately (MP=0, EM=1), the 80386 automatically
5971 traps to interrupt #7, the "processor extension not available" exception.
5972 The return link stored on the stack points to the first byte of the ESC
5973 instruction, including the prefix byte(s), if any. The exception handler can
5974 use this return link to examine the ESC instruction and proceed to emulate
5975 the numeric instruction in software.
5976
5977 The emulator must step the return pointer so that, upon return from the
5978 exception handler, execution can resume at the first instruction following
5979 the ESC instruction.
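
Stepping the return pointer requires the length of the trapped ESC
instruction. The C sketch below shows one way this might be computed from
the instruction bytes, under the assumption that only segment-override,
operand-size, and address-size prefixes precede the ESC opcode. The function
names are hypothetical, and the caller is assumed to pass a pointer to
readable bytes covering the whole instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int is_prefix(uint8_t b)
{
    switch (b) {
    case 0x26: case 0x2E: case 0x36: case 0x3E:  /* ES, CS, SS, DS  */
    case 0x64: case 0x65:                        /* FS, GS          */
    case 0x66: case 0x67:                        /* operand/address */
        return 1;
    default:
        return 0;
    }
}

/* Length in bytes of the ESC instruction at code[0], prefixes included,
   or 0 if the bytes do not form an ESC instruction.
   default_addr32: nonzero if the code segment defaults to 32-bit
   addressing. */
size_t esc_instruction_length(const uint8_t *code, int default_addr32)
{
    size_t i = 0;
    int addr32 = (default_addr32 != 0);

    while (i < 14 && is_prefix(code[i])) {
        if (code[i] == 0x67)            /* address-size override */
            addr32 = !default_addr32;
        i++;
    }
    if ((code[i] & 0xF8) != 0xD8)       /* ESC opcodes are D8H-DFH */
        return 0;
    i++;

    uint8_t modrm = code[i++];
    unsigned mod = modrm >> 6, rm = modrm & 7u;
    if (mod == 3)
        return i;                       /* register form, no operand */

    if (addr32) {
        if (rm == 4) {                  /* SIB byte follows */
            uint8_t sib = code[i++];
            if (mod == 0 && (sib & 7u) == 5)
                i += 4;                 /* disp32 with no base */
        } else if (mod == 0 && rm == 5) {
            i += 4;                     /* disp32 only */
        }
        if (mod == 1) i += 1;           /* disp8  */
        if (mod == 2) i += 4;           /* disp32 */
    } else {
        if (mod == 0 && rm == 6) i += 2;    /* disp16 only */
        if (mod == 1) i += 1;               /* disp8  */
        if (mod == 2) i += 2;               /* disp16 */
    }
    return i;
}
```

An emulator would add the returned length to the saved instruction pointer
before returning. The decoding also illustrates why the code segment must be
readable: the handler has to fetch these bytes itself.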
5980
5981 To an application program, execution on an 80386 system with 80387
5982 emulation is almost indistinguishable from execution on a system with an
5983 80387, except for the difference in execution speeds.
5984
5985 There are several important considerations when using emulation on an 80386
5986 system:
5987
5988   •  When operating in protected mode, numeric applications using the
5989      emulator must be executed in execute-readable code segments. Numeric
5990      software cannot be emulated if it is executed in execute-only code
5991      segments. This is because the emulator must be able to examine the
5992      particular numeric instruction that caused the emulation trap.
5993
5994   •  Only privileged tasks can place the 80386 in emulation mode. The
5995      instructions necessary to place the 80386 in emulation mode are
5996      privileged instructions, and are not typically accessible to an
5997      application.
5998
5999 An emulator package (EMUL387) that runs on 80386 systems is available from
6000 Intel. This emulation package operates in both real and protected mode as
6001 well as in virtual 8086 mode, providing a complete functional equivalent for
6002 the 80387 emulated in software.
6003
6004 When using the EMUL387 emulator, writers of numeric exception handlers
6005 should be aware of one slight difference between the emulated 80387 and the
6006 80387 hardware:
6007
6008   •  On the 80387 hardware, exception handlers are invoked by the 80386 at
6009      the first WAIT or ESC instruction following the instruction causing the
6010      exception. The return link, stored on the 80386 stack, points to this
6011      second WAIT or ESC instruction where execution will resume following a
6012      return from the exception handler.
6013
6014   •  Using the EMUL387 emulator, numeric exception handlers are invoked
6015      from within the emulator itself. The return link stored on the stack
6016      when the exception handler is invoked will therefore point back to the
6017      EMUL387 emulator, rather than to the program code actually being
6018      executed (emulated). An IRET return from the exception handler returns
6019      to the emulator, which then returns immediately to the emulated
6020      program. This added layer of indirection should not cause confusion,
6021      however, because the instruction causing the exception can always be
6022      identified from the 80387's instruction and data pointers.
6023
6024
6025 6.2.7  Handling Numerics Exceptions
6026
6027 Once the 80387 has been initialized and normal execution of applications
6028 has begun, the 80387 NPX may occasionally require attention in
6029 order to recover from numeric processing exceptions. This section provides
6030 details for writing software exception handlers for numeric exceptions.
6031 Numeric processing exceptions have already been introduced in Chapter 3.
6032
6033 The 80387 NPX can take one of two actions when it recognizes a numeric
6034 exception:
6035
6036   •  If the exception is masked, the NPX will automatically perform its own
6037      masked exception response, correcting the exception condition according
6038      to fixed rules, and then continuing with its instruction execution.
6039
6040   •  If the exception is unmasked, the NPX signals the exception to the
6041      80386 CPU using the ERROR# status line between the two processors. Each
6042      time the 80386 encounters an ESC or WAIT instruction in its instruction
6043      stream, the CPU checks the condition of this ERROR# status line. If
6044      ERROR# is active, the CPU automatically traps to Interrupt vector #16,
6045      the Processor Extension Error trap.
6046
6047 Interrupt vector #16 typically points to a software exception handler,
6048 which may or may not be a part of systems software. This exception handler
6049 takes the form of an 80386 interrupt procedure.
6050
6051 When handling numeric errors, the CPU has two responsibilities:
6052
6053   •  The CPU must not disturb the numeric context when an error is
6054      detected.
6055
6056   •  The CPU must clear the error and attempt recovery from the error.
6057
6058 Although the manner in which programmers may treat these responsibilities
6059 varies from one implementation to the next, most exception handlers will
6060 include these basic steps:
6061
6062   •  Store the NPX environment (control, status, and tag words, operand and
6063      instruction pointers) as it existed at the time of the exception.
6064
6065   •  Clear the exception bits in the status word.
6066
6067   •  Enable interrupts on the CPU.
6068
6069   •  Identify the exception by examining the status and control words in
6070      the saved environment.
6071
6072   •  Take some system-dependent action to rectify the exception.
6073
6074   •  Return to the interrupted program and resume normal execution.
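
The identification step can be sketched in C as follows. An exception
requires handling when its flag is set in the saved status word and its mask
bit is clear in the saved control word. The flag positions are those of the
NPX status word (bits 0-5, matching the mask bits of the control word); the
macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Exception flag bits of the status word. The control word uses the
   same bit positions for the corresponding masks. */
#define NPX_IE 0x01u   /* invalid operation   */
#define NPX_DE 0x02u   /* denormalized operand */
#define NPX_ZE 0x04u   /* zero divide          */
#define NPX_OE 0x08u   /* overflow             */
#define NPX_UE 0x10u   /* underflow            */
#define NPX_PE 0x20u   /* precision            */

/* Set of exceptions that are flagged in the saved status word and not
   masked in the saved control word. */
unsigned unmasked_exceptions(uint16_t status_word, uint16_t control_word)
{
    return status_word & ~control_word & 0x3Fu;
}
```

A handler body would typically test the result against NPX_IE, NPX_ZE, and
so on, to select its application-dependent response.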
6075
6076
6077 6.2.8  Simultaneous Exception Response
6078
6079 In cases where multiple exceptions arise simultaneously, the 80387 signals
6080 one exception according to the precedence shown at the end of Chapter 3.
6081 This means, for example, that an SNaN divided by zero results in an invalid
6082 operation, not in a zero divide exception.
6083
6084
6085 6.2.9  Exception Recovery Examples
6086
6087 Recovery routines for NPX exceptions can take a variety of forms. They can
6088 change the arithmetic and programming rules of the NPX. These changes may
6089 redefine the default fix-up for an error, change the appearance of the NPX
6090 to the programmer, or change how arithmetic is defined on the NPX.
6091
6092 A change to an exception response might be to automatically normalize all
6093 denormals loaded from memory. A change in appearance might be extending the
6094 register stack into memory to provide an "infinite" number of numeric
6095 registers. The arithmetic of the NPX can be changed to automatically extend
6096 the precision and range of variables when exceeded. All these functions can
6097 be implemented on the NPX via numeric exceptions and associated recovery
6098 routines in a manner transparent to the application programmer.
6099
6100 Some other possible application-dependent actions might include:
6101
6102   •  Incrementing an exception counter for later display or printing
6103
6104   •  Printing or displaying diagnostic information (e.g., the 80387
6105      environment and registers)
6106
6107   •  Aborting further execution
6108
6109   •  Storing a diagnostic value (a NaN) in the result and continuing with
6110      the computation
6111
6112 Notice that an exception may or may not constitute an error, depending on
6113 the application. Once the exception handler corrects the condition causing
6114 the exception, the floating-point instruction that caused the exception can
6115 be restarted, if appropriate. This cannot be accomplished using the IRET
6116 instruction, however, because the trap occurs at the ESC or WAIT instruction
6117 following the offending ESC instruction. The exception handler must obtain
6118 (using FSAVE or FSTENV) the address of the offending instruction in the task
6119 that initiated it, make a copy of it, execute the copy in the context of the
6120 offending task, and then return via IRET to the current CPU instruction
6121 stream.
6122
6123 In order to correct the condition causing the numeric exception, exception
6124 handlers must recognize the precise state of the NPX at the time the
6125 exception handler was invoked, and be able to reconstruct the state of the
6126 NPX when the exception initially occurred. To reconstruct the state of the
6127 NPX, programmers must understand when, during the execution of an NPX
6128 instruction, exceptions are actually recognized.
6129
6130 Invalid operation, zero divide, and denormalized exceptions are detected
6131 before an operation begins, whereas overflow, underflow, and precision
6132 exceptions are not raised until a true result has been computed. When a
6133 "before" exception is detected, the NPX register stack and memory have
6134 not yet been updated, and appear as if the offending instruction has not
6135 been executed.
6136
6137 When an "after" exception is detected, the register stack and memory appear
6138 as if the instruction has run to completion; i.e., they may be updated.
6139 (However, in a store or store-and-pop operation, unmasked over/underflow is
6140 handled like a "before" exception; memory is not updated and the stack is not
6141 popped.) The programming examples contained in Chapter 7 include an outline
6142 of several exception handlers to process numeric exceptions for the 80387.
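
The distinction between the two kinds of exception can be captured in a
small C predicate, which a handler might use when deciding how to
reconstruct the NPX state. This is a sketch of the rules stated above; the
enum and function names are hypothetical.

```c
#include <assert.h>

enum npx_exception {
    EXC_INVALID, EXC_DENORMAL, EXC_ZERO_DIVIDE,
    EXC_OVERFLOW, EXC_UNDERFLOW, EXC_PRECISION
};

/* Nonzero if, when the handler for an unmasked exception runs, the
   register stack and memory still look as if the offending instruction
   had not been executed.
   is_store: nonzero if the offending instruction is a store or
   store-and-pop operation. */
int operands_unmodified(enum npx_exception e, int is_store)
{
    switch (e) {
    case EXC_INVALID:
    case EXC_DENORMAL:
    case EXC_ZERO_DIVIDE:
        return 1;               /* detected before the operation */
    case EXC_OVERFLOW:
    case EXC_UNDERFLOW:
        return is_store != 0;   /* stores behave like "before" */
    case EXC_PRECISION:
        return 0;               /* detected after the result */
    }
    return 0;
}
```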
6143
6144
6145 Chapter 7  Numeric Programming Examples
6146
6147 ---------------------------------------------------------------------------
6148
6149 The following sections contain examples of numeric programs for the 80387
6150 NPX written in ASM386. These examples are intended to illustrate some of the
6151 techniques for programming the 80386/80387 computing system for numeric
6152 applications.
6153
6154
6155 7.1  Conditional Branching Example
6156
6157 As discussed in Chapter 2, several numeric instructions post their results
6158 to the condition code bits of the 80387 status word. Although there are many
6159 ways to implement conditional branching following a comparison, the basic
6160 approach is as follows:
6161
6162   •  Execute the comparison.
6163
6164   •  Store the status word. (The 80387 allows storing the status word
6165      directly into the AX register.)
6166
6167   •  Inspect the condition code bits.
6168
6169   •  Jump on the result.
6170
6171 Figure 7-1 is a code fragment that illustrates how two memory-resident
6172 double-format real numbers might be compared (similar code could be used
6173 with the FTST instruction). The numbers are called A and B, and the
6174 comparison is A to B.
6175
6176 The comparison itself requires loading A onto the top of the 80387 register
6177 stack and then comparing it to B, while popping the stack with the same
6178 instruction. The status word is then written into the 80386 AX register.
6179
6180 A and B have four possible orderings, and bits C3, C2, and C0 of the
6181 condition code indicate which ordering holds. These bits are positioned in
6182 the upper byte of the NPX status word so as to correspond to the CPU's zero,
6183 parity, and carry flags (ZF, PF, and CF), when the byte is written into the
6184 flags. The code fragment sets ZF, PF, and CF of the CPU status word to the
6185 values of C3, C2, and C0 of the NPX status word, and then uses the CPU
6186 conditional jump instructions to test the flags. The resulting code is
6187 extremely compact, requiring only seven instructions.
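
The mapping from condition codes to CPU flags can be checked with a short C
model. It assumes the usual bit positions (C0, C2, and C3 in bits 8, 10, and
14 of the status word; CF, PF, and ZF in bits 0, 2, and 6 of the flags byte
loaded by SAHF). The enum values borrow the label names of Figure 7-1; the
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SW_C0 0x0100u   /* status word bit 8  */
#define SW_C2 0x0400u   /* status word bit 10 */
#define SW_C3 0x4000u   /* status word bit 14 */

enum ordering { A_GREATER, A_LESS, A_EQUAL, A_B_UNORDERED };

/* CF, PF, and ZF as they appear in the low byte of the CPU flags after
   FSTSW AX followed by SAHF. SAHF copies AH, the upper byte of the
   status word, so C0 lands on CF, C2 on PF, and C3 on ZF. */
unsigned flags_after_sahf(uint16_t status_word)
{
    return (status_word >> 8) & 0x45u;   /* keep bits 0, 2, and 6 */
}

/* Same decision sequence as the conditional jumps in the fragment. */
enum ordering ordering_from_status(uint16_t status_word)
{
    if (status_word & SW_C2) return A_B_UNORDERED;   /* JP */
    if (status_word & SW_C0) return A_LESS;          /* JB */
    if (status_word & SW_C3) return A_EQUAL;         /* JE */
    return A_GREATER;
}
```

Testing C2 first matters: an unordered result sets C3, C2, and C0 together,
so testing C0 or C3 first would misclassify it as "less" or "equal".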
6188
6189 The FXAM instruction updates all four condition code bits. Figure 7-2 shows
6190 how a jump table can be used to determine the characteristics of the value
6191 examined. The jump table (FXAM_TBL) is initialized to contain the 32-bit
6192 displacement of 16 labels, one for each possible condition code setting.
6193 Note that four of the table entries contain the same value, "EMPTY." The
6194 first two of these correspond to the condition code settings that FXAM
6195 reports for an empty register. The other two will never be used on the
6196 80387, but may be used if the code is executed with an 80287.
6197
6198 The program fragment performs the FXAM and stores the status word. It then
6199 manipulates the condition code bits to produce a number in register EAX
6200 that equals the condition code times 4. This involves zeroing the unused
6201 bits in the word that contains the code, shifting the code so that C2-C0
6202 are scaled by 4, and repositioning C3 so that it is adjacent to C2. The
6203 resulting value is used as an index that selects one of the displacements
6204 from FXAM_TBL (the multiplication of the condition code is required because
6205 of the 4-byte length of each value in FXAM_TBL). The unconditional JMP
6206 instruction effectively vectors through the jump table to the labeled
6207 routine that contains code (not shown in the example) to process each
6208 possible result of the FXAM instruction.
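
The index calculation can be traced step by step with the following C
model, which mirrors the AND/SHR/SAL/OR sequence of Figure 7-2 on a copy of
the status word. With the 32-bit (DD) table entries of the figure, the
sequence yields a byte offset of four times the condition code. The function
names are hypothetical; the class names are the labels of the figure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static const char *const fxam_class[16] = {
    "POS_UNNORM", "POS_NAN",      "NEG_UNNORM", "NEG_NAN",
    "POS_NORM",   "POS_INFINITY", "NEG_NORM",   "NEG_INFINITY",
    "POS_ZERO",   "EMPTY",        "NEG_ZERO",   "EMPTY",
    "POS_DENORM", "EMPTY",        "NEG_DENORM", "EMPTY"
};

/* Byte offset into a table of 4-byte entries, computed with the same
   steps as the assembly fragment. */
uint32_t fxam_table_offset(uint16_t status_word)
{
    uint32_t eax = status_word;          /* XOR EAX,EAX / FSTSW AX    */
    eax &= 0x4700u;                      /* keep C3 and C2-C0         */
    eax >>= 6;                           /* C2-C0 -> bits 4-2,
                                            C3    -> bit 8 (AH bit 0) */
    uint32_t ah = (eax >> 8) & 0xFFu;
    uint32_t al = eax & 0xFFu;
    ah = (ah << 5) & 0xFFu;              /* SAL AH,5: C3 -> AH bit 5  */
    al |= ah;                            /* OR AL,AH: C3 next to C2   */
    return al;                           /* XOR AH,AH                 */
}

/* Label that the jump table would select for this status word. */
const char *fxam_class_name(uint16_t status_word)
{
    return fxam_class[fxam_table_offset(status_word) / 4];
}
```

For example, a status word with only C3 set (4000H) gives offset 32, the
ninth table entry, which is POS_ZERO.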
6209
6210
6211 Figure 7-1.  Conditional Branching for Compares
6212
6213     .
6214     .
6215     .
6216 A   DQ  ?
6217 B   DQ  ?
6218     .
6219     .
6220     .
6221     FLD     A  ; LOAD A ONTO TOP OF 387 STACK
6222     FCOMP   B  ; COMPARE A:B, POP A
6223     FSTSW   AX ; STORE RESULT TO CPU AX REGISTER
6224 ;
6225 ; CPU AX REGISTER CONTAINS CONDITION CODES
6226 ;   (RESULTS OF COMPARE)
6227 ; LOAD CONDITION CODES INTO CPU FLAGS
6228 ;
6229     SAHF
6230 ;
6231 ; USE CONDITIONAL JUMPS TO DETERMINE ORDERING OF A TO B
6232 ;
6233     JP A_B_UNORDERED   ; TEST C2 (PF)
6234     JB A_LESS          ; TEST C0 (CF)
6235     JE A_EQUAL         ; TEST C3 (ZF)
6236 A_GREATER:             ; C0 (CF) = 0, C3 (ZF) = 0
6237     .
6238     .
6239 A_EQUAL:                   ; C0 (CF) = 0, C3 (ZF) = 1
6240     .
6241     .
6242 A_LESS:                    ; C0 (CF) = 1, C3 (ZF) = 0
6243     .
6244     .
6245 A_B_UNORDERED:             ; C2 (PF) = 1
6246     .
6247     .
6248
6249
6250 Figure 7-2.  Conditional Branching for FXAM
6251
6252 ; JUMP TABLE FOR EXAMINE ROUTINE
6253 ;
6254 FXAM_TBL  DD POS_UNNORM, POS_NAN, NEG_UNNORM, NEG_NAN,
6255 &         POS_NORM, POS_INFINITY, NEG_NORM,
6256 &         NEG_INFINITY, POS_ZERO, EMPTY, NEG_ZERO,
6257 &         EMPTY, POS_DENORM, EMPTY, NEG_DENORM, EMPTY
6258     .
6259     .
6260 ; EXAMINE ST AND STORE RESULT (CONDITION CODES)
6261
6262     FXAM
6263     XOR EAX,EAX ; CLEAR EAX
6264     FSTSW AX
6265
6266 ; CALCULATE OFFSET INTO JUMP TABLE
6267
6268     AND AX,0100011100000000B ; CLEAR ALL BITS EXCEPT C3, C2-C0
6269     SHR EAX,6    ;  SHIFT C2-C0 INTO PLACE    (000XXX00)
6270     SAL AH,5     ;  POSITION C3               (00X00000)
6271     OR  AL,AH    ;  DROP C3 IN ADJACENT TO C2 (00XXXX00)
6272     XOR AH,AH    ;  CLEAR OUT THE OLD COPY OF C3
6273
6274 ; JUMP TO THE ROUTINE `ADDRESSED' BY CONDITION CODE
6275
6276     JMP FXAM_TBL[EAX]
6277
6278 ; HERE ARE THE JUMP TARGETS, ONE TO HANDLE
6279 ;    EACH POSSIBLE RESULT OF FXAM
6280
6281 POS_UNNORM:
6282     .
6283 POS_NAN:
6284     .
6285 NEG_UNNORM:
6286     .
6287 NEG_NAN:
6288     .
6289 POS_NORM:
6290     .
6291 POS_INFINITY:
6292     .
6293 NEG_NORM:
6294     .
6295 NEG_INFINITY:
6296     .
6297 POS_ZERO:
6298     .
6299 EMPTY:
6300     .
6301 NEG_ZERO:
6302     .
6303 POS_DENORM:
6304     .
6305 NEG_DENORM:
6306
6307
6308 7.2  Exception Handling Examples
6309
6310 There are many approaches to writing exception handlers. One useful
6311 technique is to consider the exception handler procedure as consisting of
6312 "prologue," "body," and "epilogue" sections of code. This procedure is
6313 invoked via interrupt number 16.
6314
6315 At the beginning of the prologue, CPU interrupts have been disabled. The
6316 prologue performs all functions that must be protected from possible
6317 interruption by higher-priority sources. Typically, this involves saving CPU
6318 registers and transferring diagnostic information from the 80387 to memory.
6319 When the critical processing has been completed, the prologue may enable CPU
6320 interrupts to allow higher-priority interrupt handlers to preempt the
6321 exception handler.
6322
6323 The body of the exception handler examines the diagnostic information and
6324 makes a response that is necessarily application-dependent. This response
6325 may range from halting execution, to displaying a message, to attempting to
6326 repair the problem and proceed with normal execution.
6327
6328 The epilogue essentially reverses the actions of the prologue, restoring
6329 the CPU and the NPX so that normal execution can be resumed. The epilogue
6330 must not load an unmasked exception flag into the 80387 or another exception
6331 will be requested immediately.
6332
6333 Figures 7-3 through 7-5 show the ASM386 coding of three skeleton
6334 exception handlers. They show how prologues and epilogues can be written for
6335 various situations, but provide only comments indicating where the
6336 application-dependent exception handling body should be placed.
6337
6338 Figures 7-3 and 7-4 are very similar; their only substantial difference is
6339 their choice of instructions to save and restore the 80387. The tradeoff
6340 here is between the increased diagnostic information provided by FNSAVE and
6341 the faster execution of FNSTENV. For applications that are sensitive to
6342 interrupt latency or that do not need to examine register contents, FNSTENV
6343 reduces the duration of the "critical region," during which the CPU does not
6344 recognize another interrupt request.
6345
6346 After the exception handler body, the epilogues prepare the CPU and the NPX
6347 to resume execution from the point of interruption (i.e., the instruction
6348 following the one that generated the unmasked exception). Notice that the
6349 exception flags in the memory image that is loaded into the 80387 are
6350 cleared to zero prior to reloading (in fact, in these examples, the entire
6351 low-order byte of the status word image is cleared).
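
The sizes and offsets used by the handlers (28 bytes for the environment,
108 bytes for the full state, status word at offset 4) can be expressed as C
structures. The layout below is a sketch of the 32-bit protected-mode
memory image as described for FSAVE and FSTENV in Chapter 4; the struct and
field names are hypothetical, and other operating modes use different
layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Environment image written by FNSTENV (assumed 32-bit protected mode). */
struct npx_environment {
    uint16_t control_word;      uint16_t reserved0;
    uint16_t status_word;       uint16_t reserved1;
    uint16_t tag_word;          uint16_t reserved2;
    uint32_t ip_offset;
    uint16_t cs_selector;       uint16_t opcode;    /* low 11 bits used */
    uint32_t operand_offset;
    uint16_t operand_selector;  uint16_t reserved3;
};

/* Full state image written by FNSAVE: the environment followed by the
   eight 80-bit stack registers. */
struct npx_state {
    struct npx_environment env;
    uint8_t st[8][10];
};

_Static_assert(sizeof(struct npx_environment) == 28, "FNSTENV image");
_Static_assert(sizeof(struct npx_state) == 108, "FNSAVE image");
_Static_assert(offsetof(struct npx_environment, status_word) == 4,
               "status word at image base + 4");

/* Equivalent of MOV BYTE PTR [image+4],0: clears the exception flags
   (and the rest of the low-order status byte) before the image is
   reloaded. */
void clear_exception_flags(struct npx_environment *env)
{
    env->status_word &= 0xFF00u;
}
```

The static assertions tie the structure to the constants 28, 108, and the
-104 and -24 displacements that appear in the handler skeletons.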
6352
6353 The examples in Figures 7-3 and 7-4 assume that the exception handler
6354 itself will not cause an unmasked exception. Where this is a possibility,
6355 the general approach shown in Figure 7-5 can be employed. The basic
6356 technique is to save the full 80387 state and then to load a new control
6357 word in the prologue. Note that considerable care should be taken when
6358 designing an exception handler of this type to prevent the handler from
6359 being reentered endlessly.
6360
6361
6362 Figure 7-3.  Full-State Exception Handler
6363
6364 SAVE_ALL         PROC
6365 ;
6366 ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE
6367 ; FOR 80387 STATE IMAGE
6368     PUSH EBP
6369     MOV  EBP,ESP
6370     SUB  ESP,108
6371 ; SAVE FULL 80387 STATE, ENABLE CPU INTERRUPTS
6372     FNSAVE  [EBP-108]
6373     STI
6374 ;
6375 ; APPLICATION-DEPENDENT EXCEPTION HANDLING
6376 ; CODE GOES HERE
6377 ;
6378 ; CLEAR EXCEPTION FLAGS IN STATUS WORD
6379 ;  (WHICH IS IN MEMORY)
6380 ; RESTORE MODIFIED STATE IMAGE
6381     MOV BYTE PTR [EBP-104], 0H
6382     FRSTOR  [EBP-108]
6383 ; DEALLOCATE STACK SPACE, RESTORE CPU REGISTERS
6384     MOV ESP,EBP
6385       .
6386       .
6387     POP EBP
6388 ;
6389 ; RETURN TO INTERRUPTED CALCULATION
6390     IRET
6391 SAVE_ALL         ENDP
6392
6393
6394 Figure 7-4.  Reduced-Latency Exception Handler
6395
6396 SAVE_ENVIRONMENT PROC
6397 ;
6398 ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE
6399 ; FOR 80387 ENVIRONMENT
6400     PUSH    EBP
6401        .
6402     MOV     EBP,ESP
6403     SUB     ESP,28
6404 ; SAVE ENVIRONMENT, ENABLE CPU INTERRUPTS
6405     FNSTENV [EBP-28]
6406     STI
6407 ;
6408 ; APPLICATION EXCEPTION-HANDLING CODE GOES HERE
6409 ;
6410 ; CLEAR EXCEPTION FLAGS IN STATUS WORD
6411 ;  (WHICH IS IN MEMORY)
6412 ; RESTORE MODIFIED ENVIRONMENT IMAGE
6413     MOV     BYTE PTR [EBP-24], 0H
6414     FLDENV  [EBP-28]
6415 ; DE-ALLOCATE STACK SPACE, RESTORE CPU REGISTERS
6416     MOV     ESP,EBP
6417     POP     EBP
6418 ;
6419 ; RETURN TO INTERRUPTED CALCULATION
6420     IRET
6421 SAVE_ENVIRONMENT ENDP
6422
6423
6424 Figure 7-5.  Reentrant Exception Handler
6425
6426         .
6427         .
6428         .
6429     LOCAL_CONTROL  DW  ?  ; ASSUME INITIALIZED
6430         .
6431         .
6432         .
6433 REENTRANT             PROC
6434 ;
6435 ; SAVE CPU REGISTERS, ALLOCATE STACK SPACE FOR
6436 ; 80387 STATE IMAGE
6437     PUSH    EBP
6438        .
6439        .
6440        .
6441     MOV     EBP,ESP
6442     SUB     ESP,108
6443 ; SAVE STATE, LOAD NEW CONTROL WORD,
6444 ; ENABLE CPU INTERRUPTS
6445     FNSAVE  [EBP-108]
6446     FLDCW   LOCAL_CONTROL
6447     STI
6448        .
6449        .
6450        .
6451 ; APPLICATION EXCEPTION HANDLING CODE GOES HERE.
6452 ; AN UNMASKED EXCEPTION GENERATED HERE WILL
6453 ; CAUSE THE EXCEPTION HANDLER TO BE REENTERED.
6454 ; IF LOCAL STORAGE IS NEEDED, IT MUST BE
6455 ; ALLOCATED ON THE CPU STACK.
6456        .
6457        .
6458        .
6459 ; CLEAR EXCEPTION FLAGS IN STATUS WORD
6460 ;  (WHICH IS IN MEMORY)
6461 ; RESTORE MODIFIED STATE IMAGE
6462     MOV     BYTE PTR [EBP-104], 0H
6463     FRSTOR  [EBP-108]
6464 ; DE-ALLOCATE STACK SPACE, RESTORE CPU REGISTERS
6465     MOV     ESP,EBP
6466        .
6467        .
6468        .
6469     POP     EBP
6470 ; RETURN TO POINT OF INTERRUPTION
6471     IRET
6472 REENTRANT             ENDP
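
The displacements [EBP-104] and [EBP-24] used in Figures 7-3 through 7-5
follow from where the status word sits in the saved image. The Python sketch
below is an illustration only and is not part of the original figures. It
assumes the 32-bit protected-mode image format, in which the environment is
seven 4-byte slots (control word first, status word second) followed, for
FNSAVE, by eight 10-byte registers. The names ENV_FIELDS, field_offset,
ebp_displacement, and clear_exception_flags are hypothetical helpers
introduced for this sketch.

```python
# Model of the saved 80387 image (assumed: 32-bit protected-mode format).
ENV_FIELDS = [
    "control_word",
    "status_word",
    "tag_word",
    "ip_offset",
    "cs_selector",
    "operand_offset",
    "operand_selector",
]
SLOT_SIZE = 4        # each environment field occupies a 4-byte slot
REGISTER_SIZE = 10   # each stack register is an 80-bit extended real
REGISTER_COUNT = 8

ENV_SIZE = SLOT_SIZE * len(ENV_FIELDS)                   # FNSTENV image
STATE_SIZE = ENV_SIZE + REGISTER_SIZE * REGISTER_COUNT   # FNSAVE image


def field_offset(name):
    """Byte offset of an environment field from the start of the image."""
    return SLOT_SIZE * ENV_FIELDS.index(name)


def ebp_displacement(image_size, name):
    """Displacement from EBP when the image starts at [EBP - image_size]."""
    return -image_size + field_offset(name)


def clear_exception_flags(image):
    """Zero the low byte of the status word, as MOV BYTE PTR [...],0H does."""
    image[field_offset("status_word")] = 0
    return image


print(ENV_SIZE, STATE_SIZE)                          # 28 108
print(ebp_displacement(STATE_SIZE, "status_word"))   # -104
print(ebp_displacement(ENV_SIZE, "status_word"))     # -24
```

Clearing the low byte of the status word zeroes the six exception flags
together with the stack-fault and error-summary bits, all of which occupy
bits 0 through 7.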
6473
6474
6475 7.3  Floating-Point to ASCII Conversion Examples
6476
6477 Numeric programs must typically format their results at some point for
6478 presentation and inspection by the program user. In many cases, numeric
6479 results are formatted as ASCII strings for printing or display. This example
6480 shows how floating-point values can be converted to decimal ASCII character
6481 strings. The function shown in Figure 7-6 can be invoked from PL/M-386,
6482 Pascal-386, FORTRAN-386, or ASM386 routines.
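
As a rough model of the interface, the Python sketch below splits a finite
value into a sign character, a string of decimal digits, and a separate
power of ten, so that the value is approximately (digit string) * 10**power.
It is an illustration only: it uses Python's decimal module rather than the
80387 BCD store used in Figure 7-6, it does not handle NaNs or infinities,
the 18-digit limit is taken from the packed BCD format, and the function
name to_digits_and_power is hypothetical.

```python
import math
from decimal import Decimal, ROUND_HALF_EVEN

BCD_DIGITS = 18  # decimal digits available in the packed BCD format


def to_digits_and_power(value, field_size):
    """Return (string, power) with value ~= int(string) * 10**power."""
    if field_size < 2:
        raise ValueError("field_size must hold a sign and one digit")
    max_digits = min(field_size - 1, BCD_DIGITS)
    sign = "-" if math.copysign(1.0, value) < 0 else "+"
    magnitude = Decimal(abs(value))   # exact value of the binary float
    if magnitude == 0:
        return sign + "0", 0
    # Integers that fit in the field are emitted exactly, with power 0.
    is_integer = magnitude == magnitude.to_integral_value()
    if is_integer and magnitude < Decimal(10) ** max_digits:
        return sign + str(int(magnitude)), 0
    # Otherwise keep max_digits significant digits and report the scale.
    power = magnitude.adjusted() - (max_digits - 1)
    rounded = magnitude.quantize(Decimal(1).scaleb(power),
                                 rounding=ROUND_HALF_EVEN)
    digits = int(rounded.scaleb(-power))
    if digits >= 10 ** max_digits:    # rounding carried into a new digit
        digits //= 10
        power += 1
    return sign + str(digits), power


print(to_digits_and_power(3.0, 8))    # ('+3', 0)
print(to_digits_and_power(0.5, 4))    # ('+500', -3)
```

In this model an integer that fits the field keeps a power of zero, and
every other value is returned in scaled form with the field filled.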
6483
6484 The routine favors shortness, speed, and accuracy over providing the
6485 maximum possible number of significant digits. An attempt is made to keep
6486 integers in their own domain to avoid unnecessary conversion errors.
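
One place where the routine in Figure 7-6 trades exactness for speed is its
estimate of the decimal order of magnitude, which it forms from the binary
exponent alone by multiplying the power of two by LOG10(2). The Python
sketch below illustrates that estimate under stated assumptions: it
truncates the product (the routine itself accepts any rounding mode), it
handles positive finite values only, and the function names are
hypothetical.

```python
import math
from decimal import Decimal

LOG10_OF_2 = math.log10(2)   # the constant that FLDLG2 loads


def estimate_power_of_ten(value):
    """Estimate the decimal order of a positive value from its exponent."""
    _, exponent = math.frexp(value)   # value = m * 2**exponent, 0.5 <= m < 1
    power_two = exponent - 1          # rescaled so that 1 <= fraction < 2
    return math.trunc(power_two * LOG10_OF_2)


def true_power_of_ten(value):
    """Exact floor(log10(value)) for a positive finite value."""
    return Decimal(value).adjusted()


for v in (1.0, 1000.0, 0.02, 6.02e23):
    print(v, estimate_power_of_ten(v), true_power_of_ten(v))
```

For 1000.0 the binary exponent is 9, so the estimate is 2 while the true
order is 3. An estimate that can be off by one order is why a later
correction against exact powers of ten is needed.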
6487
6488 Using the extended-precision real number format, this routine achieves a
6489 worst-case accuracy of three units in the 16th decimal position for
6490 nonintegers and for integers greater than 10^(18). This is double-precision
6491 accuracy. For values with decimal exponents less than 100 in magnitude,
6492 the accuracy is one unit in the 17th decimal position.
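
The digit counts quoted above can be related to significand widths with a
little arithmetic. The Python sketch below is a worked check rather than
part of the routine. It assumes a 64-bit significand for the
extended-precision format, a 53-bit significand for double precision, and
18 digits for packed BCD; the constant names are chosen for this sketch.

```python
import math

BCD_DIGITS = 18      # decimal digits in the packed BCD format
EXTENDED_BITS = 64   # significand bits, extended-precision format
DOUBLE_BITS = 53     # significand bits, double precision (implicit bit included)

largest_bcd_integer = 10 ** BCD_DIGITS - 1

# Every integer of up to 18 decimal digits fits in a 64-bit significand.
print(largest_bcd_integer.bit_length())            # 60
print(largest_bcd_integer < 2 ** EXTENDED_BITS)    # True

# Approximate number of decimal digits carried by each significand width.
print(round(DOUBLE_BITS * math.log10(2), 2))       # 15.95
print(round(EXTENDED_BITS * math.log10(2), 2))     # 19.27
```

The first result shows why integers below 10^(18) can be held without
error, and the last two show roughly 16 decimal digits for double
precision against roughly 19 for the extended format.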
6493
6494 Higher precision can be achieved with greater care in programming, larger
6495 program size, and lower performance.
6496
6497
6498 Figure 7-6.  Floating-Point to ASCII Conversion Routine
6499
6500 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE FLOATING_TO_ASCII
6501 OBJECT MODULE PLACED IN fpasc.obj
6502 ASSEMBLER INVOKED BY: asm386 fpasc.asm
6503
6504 LOC      OBJ                          LINE    SOURCE
6505
6506                                 1 +1  $title(`Convert a floating point number to ASCII')
6507                                 2
6508                                 3                    name    floating_to_ascii
6509                                 4
6510 00000000                        5                    public  floating_to_ascii
6511                                 6                    extrn   get_power_10:near,tos_status:near
6512                                 7    ;
6513                                 8    ; This subroutine will convert the floating point
6514                                 9    ; number in the top of the NPX stack to an ASCII
6515                                10    ; string and separate power of 10 scaling value
6516                                11    ; (in binary).  The maximum width of the ASCII string
6517                                12    ; formed is controlled by a parameter which must be
6518                                13    ; > 1.  Unnormal values, denormal values, and pseudo
6519                                14    ; zeroes will be correctly converted. However, unnormals
6520                                15    ; and pseudo zeros are no longer supported formats on the
6521                                16    ; 80387 (in conformance with the IEEE floating point
6522                                17    ; standard) and hence not generated internally. A
6523                                18    ; returned value will indicate how many binary bits
6524                                19    ; of precision were lost in an unnormal or denormal
6525                                20    ; value.  The magnitude (in terms of binary power)
6526                                21    ; of a pseudo zero will also be indicated. Integers
6527                                22    ; less than 10**18 in magnitude are accurately converted
6528                                23    ; if the destination ASCII string field is wide enough
6529                                24    ; to hold all the digits. Otherwise the value is converted
6530                                25    ; to scientific notation.
6531                                26    ;
6532                                27    ; The status of the conversion is identified by the
6533                                28    ; return value, which can be:
6534                                29    ;
6535                                30    ;       0       conversion complete, string_size is defined
6536                                31    ;       1       invalid arguments
6537                                32    ;       2       exact integer conversion, string_size is defined
6538                                33    ;       3       indefinite
6539                                34    ;       4       + NAN (Not A Number)
6540                                35    ;       5       - NAN
6541                                36    ;       6       + Infinity
6542                                37    ;       7       - Infinity
6543                                38    ;       8       pseudo zero found, string_size is defined
6544                                39    ;
6545                                40    ;         The PL/M-386 calling convention is:
6546                                41    ;
6547                                42    ; floating_to_ascii:
6548                                43    ;       procedure (number,denormal_ptr,string_ptr,size_ptr,
6549                                44    ;       field_size, power_ptr) word external;
6550                                45    ;       declare (denormal_ptr,string_ptr,power_ptr,size_ptr)
6551                                46    ;       pointer;
6552                                47    ;       declare field_size word,
6553                                48    ;       string_size based size_ptr word;
6554                                49    ;       declare number real;
6555                                50    ;       declare denormal integer based denormal_ptr;
6556                                51    ;       declare power integer based power_ptr;
6557                                52    ;       end floating_to_ascii;
6558                                53    ;
6559                                54    ;         The floating point value is expected to be
6560                                55    ;   on the top of the  NPX stack.  This subroutine
6561                                56    ;   expects 3 free entries on the NPX stack and
6562                                57    ;   will pop the passed value off when done.  The
6563                                58    ;   generated ASCII string will have a leading
6564                                59    ;   character either `-' or `+' indicating the sign
6565                                60    ;   of the value.  The ASCII decimal digits will
6566                                61    ;   immediately follow. The numeric value of the
6567                                62    ;   ASCII string is (ASCII STRING.)*10**POWER. If
6568                                63    ;   the given number was zero, the ASCII string will
6569                                64    ;   contain a sign and a single zero character.  The
6570                                65    ;   value string_size indicates the total length of
6571                                66    ;   the ASCII string including the sign character.
6572                                67    ;   String(0) will always hold the sign.  It is
6573                                68    ;   possible for string_size to be less than
6574                                69    ;   field_size. This occurs for zeroes or integer
6575                                70    ;   values.  A pseudo zero will return a special
6576                                71    ;   return code.  The denormal count will indicate
6577                                72    ;   the power of two originally associated with the
6578                                73    ;   value.  The power of ten and ASCII string will
6579                                74    ;   be as if the value was an ordinary zero.
6580                                75    ;
6581                                76    ;   This subroutine is accurate up to a maximum of
6582                                77    ;   18 decimal digits for integers.  Integer values
6583                                78    ;   will have a decimal power of zero associated
6584                                79    ;   with them. For non integers, the result will be
6585                                80    ;   accurate to within 2 decimal digits of the 16th
6586                                81    ;   decimal place (double precision).  The exponentiate
6587                                82    ;   instruction is also used for scaling the value into
6588                                83    ;   the range acceptable for the BCD data type.  The
6589                                84    ;   rounding mode in effect on entry to the
6590                                85    ;   subroutine is used for the conversion.
6591                                86    ;
6592                                87    ;         The following registers are not transparent:
6593                                88    ;
6594                                89    ;               eax ebx ecx edx esi edi eflags
6595                                90    ;
6596                                91    ;
6597                                92    ;         Define the stack layout.
6598                                93    ;
6599 00000000[]                     94    ebp_save       equ     dword ptr [ebp]
6600 00000004[]                     95    es_save        equ     ebp_save + size ebp_save
6601 00000008[]                     96    return_ptr     equ     es_save + size es_save
6602 0000000C[]                     97    power_ptr      equ     return_ptr + size return_ptr
6603 00000010[]                     98    field_size     equ     power_ptr + size power_ptr
6604 00000014[]                     99    size_ptr       equ     field_size + size field_size
6605 00000018[]                    100    string_ptr     equ     size_ptr + size size_ptr
6606 0000001C[]                    101    denormal_ptr   equ     string_ptr + size string_ptr
6607                               102
6608 0014                          103    parms_size     equ     size power_ptr + size field_size +
6609                               104    &              size size_ptr + size string_ptr +
6610                               105    &              size denormal_ptr
6611                               106    ;
6612                               107    ;        Define constants used
6613                               108    ;
6614                               109    BCD_DIGITS     equ     18       ; Number of digits in bcd_value
6615                               110    WORD_SIZE      equ     4
6616                               111    BCD_SIZE       equ     10
6617                               112    MINUS          equ     1        ; Define return values
6618                               113    NAN            equ     4        ; The exact values chosen
6619                               114    INFINITY       equ     6        ; here are important.  They must
6620                               115    INDEFINITE     equ     3        ; correspond to the possible return
6621                               116    PSEUDO_ZERO    equ     8        ; values and be in the same numeric
6622                               117    INVALID        equ     -2       ; order as tested by the program.
6623                               118    ZERO           equ     -4
6624                               119    DENORMAL       equ     -6
6625                               120    UNNORMAL       equ     -8
6626                               121    NORMAL         equ     0
6627                               122    EXACT          equ     2
6628                               123    ;
6629                               124    ;        Define layout of temporary storage area.
6630                               125    ;
6631                               126    power_two      equ     word ptr [ebp - WORD_SIZE]
6632                               127    bcd_value      equ     tbyte ptr power_two - BCD_SIZE
6633                               128    bcd_byte       equ     byte ptr bcd_value
6634                               129    fraction       equ     bcd_value
6635                               130
6636                               131    local_size     equ     size power_two + size bcd_value
6637                               132    ;
6638                               133    ;        Allocate stack space for the temporaries so
6639                               134    ;    the stack will be big enough
6640                               135    ;
6641                               136    stack  stackseg (local_size+6) ; Allocate stack
6642                               137                                   ; space for locals
6643                               138 +1 $eject
6644                               139    code           segment public er
6645                               140                   extrn   power_table:qword
6646                               141    ;
6647                               142    ;       Constants used by this function.
6648                               143    ;
6649                               144                   even                    ; Optimize for 16 bits
6650 00000000 0A00                 145    const10        dw      10              ; Adjustment value for
6651                               146    ;                      ; too big BCD
6652                               147    ;
6653                               148    ; Convert the C3,C2,C1,C0 encoding from tos_status
6654                               149    ; into meaningful bit flags and values.
6655                               150    ;
6656 00000002 F8                   151    status_table   db      UNNORMAL, NAN, UNNORMAL + MINUS,
6657 00000003 04                   152    &       NAN + MINUS, NORMAL, INFINITY,
6658 00000004 F9                   153    &       NORMAL + MINUS,  INFINITY + MINUS,
6659 00000005 05                   154    &              ZERO, INVALID, ZERO + MINUS, INVALID,
6660 00000006 00                   155    &              DENORMAL, INVALID, DENORMAL + MINUS, INVALID
6661 00000007 06
6662 00000008 01
6663 00000009 07
6664 0000000A FC
6665 0000000B FE
6666 0000000C FD
6667 0000000D FE
6668 0000000E FA
6669 0000000F FE
6670 00000010 FB
6671 00000011 FE
6672                               156
6673 00000012                      157    floating_to_ascii proc
6674                               158
6675 00000012 E800000000        E  159            call   tos_status       ; Look at status of ST(0)
6676                               160
6677                               161    ; Get descriptor from table
6678 00000017 2E0FB68002000000  R  162            movzx  eax, status_table[eax]
6679 0000001F 3CFE                 163            cmp    al,INVALID              ; Look for empty ST(0)
6680 00000021 7527                 164            jne    not_empty
6681                               165    ;
6682                               166    ;         ST(0) is empty!  Return the status value.
6683                               167    ;
6684 00000023 C21400               168            ret    parms_size
6685                               169    ;
6686                               170    ;         Remove infinity from stack and exit.
6687                               171    ;
6688 00000026                      172    found_infinity:
6689 00000026 DDD8                 173            fstp   st(0)            ; OK to leave fstp running
6690 00000028 EB02                 174            jmp    short exit_proc
6691                               175    ;
6692                               176    ;         String space is too small!
6693                               177    ;      Return invalid code.
6694                               178    ;
6695 0000002A                      179    small_string:
6696 0000002A B0FE                 180            mov    al,INVALID
6697 0000002C                      181    exit_proc:
6698 0000002C C9                   182            leave          ; Restore stack setup
6699 0000002D 07                   183            pop     es
6700 0000002E C21400               184            ret     parms_size
6701                               185    ;
6702                               186    ; ST(0) is NAN or indefinite.  Store the
6703                               187    ; value in memory and look at the fraction
6704                               188    ; field to separate indefinite from an ordinary NAN.
6705                               189    ;
6706 00000031                      190    NAN_or_indefinite:
6707 00000031 DB7DF2               191            fstp    fraction        ; Remove value from stack
6708                               192                                ; for examination
6709 00000034 A801                 193            test    al,MINUS        ; Look at sign bit
6710 00000036 9B                   194            fwait                           ; Ensure store is done
6711 00000037 74F3                 195            jz      exit_proc               ; Can't be indefinite if
6712                               196                                ; positive
6713                               197
6714 00000039 BB000000C0           198            mov     ebx,0C0000000H  ; Match against upper 32
6715                               199                                ;bits of fraction
6716                               200
6717                               201    ; Compare bits 63-32
6718 0000003E 2B5DF6               202            sub     ebx, dword ptr fraction + 4
6719                               203
6720                               204    ; Bits 31-0 must be zero
6721 00000041 0B5DF2               205            or      ebx, dword ptr fraction
6722 00000044 75E6                 206            jnz     exit_proc
6723                               207
6724                               208    ; Set return value for indefinite value
6725 00000046 B003                 209        mov al,INDEFINITE
6726 00000048 EBE2                 210            jmp     exit_proc
6727                               211    ;
6728                               212    ;         Allocate stack space for local variables
6729                               213    ;     and establish parameter addressability.
6730                               214    ;
6731 0000004A                      215    not_empty:
6732 0000004A 06                   216            push    es              ; Save working register
6733 0000004B C80C0000             217            enter local_size, 0     ; Setup stack addressing
6734                               218
6735                               219
6736                               220    ; Check for enough string space
6737 0000004F 8B4D10               221            mov     ecx,field_size
6738 00000052 83F902               222            cmp     ecx,2
6739 00000055 7CD3                 223            jl      small_string
6740                               224
6741 00000057 49                   225            dec     ecx             ; Adjust for sign character
6742                               226
6743                               227    ; See if string is too large for BCD
6744 00000058 83F912               228            cmp     ecx,BCD_DIGITS
6745 0000005B 7605                 229            jbe     size_ok
6746                               230
6747                               231    ; Else set maximum string size
6748 0000005D B912000000           232            mov     ecx,BCD_DIGITS
6749 00000062                      233    size_ok:
6750 00000062 3C06                 234            cmp     al,INFINITY     ; Look for infinity
6751                               235
6752                               236    ; Return status value for + or - inf
6753 00000064 7DC0                 237            jge     found_infinity
6754                               238
6755 00000066 3C04                 239             cmp     al,NAN          ; Look for NAN or INDEFINITE
6756 00000068 7DC7                 240             jge     NAN_or_indefinite
6757                               241    ;
6758                               242    ;  Set default return values and check that
6759                               243    ;  the number is normalized.
6760                               244    ;
6761 0000006A D9E1                 245            fabs    ; Use positive value only
6762                               246                            ; sign bit in al has true sign of value
6763 0000006C 31D2                 247            xor     edx,edx                 ; Form 0 constant
6764 0000006E 8B7D1C               248            mov     edi,denormal_ptr; Zero denormal count
6765 00000071 668917               249            mov     [edi], dx
6766 00000074 8B5D0C               250            mov     ebx,power_ptr   ; Zero power of ten value
6767 00000077 668913               251            mov     [ebx], dx
6768 0000007A 88C2                 252            mov dl, al
6769 0000007C 80E201               253            and dl, 1
6770 0000007F 80C202               254        add dl, EXACT
6771 00000082 3CFC                 255            cmp     al,ZERO                 ; Test for zero
6772 00000084 0F83BC000000         256            jae     convert_integer ; Skip power code if value
6773                               257                                                    ; is zero
6774 0000008A DB7DF2               258        fstp   fraction
6775 0000008D 9B                   259        fwait
6776 0000008E 8A45F9               260        mov    al, bcd_byte + 7
6777 00000091 804DF980             261        or     byte ptr bcd_byte +  7, 80h
6778 00000095 DB6DF2               262        fld    fraction
6779 00000098 D9F4                 263        fxtract
6780 0000009A A880                 264        test   al, 80h
6781 0000009C 7524                 265        jnz    normal_value
6782                               266
6783 0000009E D9E8                 267        fld1
6784 000000A0 DEE9                 268        fsub
6785 000000A2 D9E4                 269        ftst
6786 000000A4 9BDFE0               270        fstsw  ax
6787 000000A7 9E                   271        sahf
6788 000000A8 7510                 272        jnz    set_unnormal_count
6789                               273    ;
6790                               274    ;   Found a pseudo zero
6791                               275    ;
6792 000000AA D9EC                 276        fldlg2              ; Develop power of ten estimate
6793 000000AC 80C206               277        add    dl, PSEUDO_ZERO - EXACT
6794 000000AF DECA                 278        fmulp  st(2), st
6795 000000B1 D9C9                 279        fxch                  ; Get power of ten
6796 000000B3 DF1B                 280        fistp  word ptr [ebx] ; Set power of ten
6797 000000B5 E98C000000           281        jmp    convert_integer
6798                               282
6799 000000BA                      283    set_unnormal_count:
6800 000000BA D9F4                 284        fxtract               ; Get original fraction,
6801                               285                              ; now normalized
6802 000000BC D9C9                 286        fxch                  ; Get unnormal count
6803 000000BE D9E0                 287        fchs
6804 000000C0 DF1F                 288        fistp  word ptr [edi] ; Set unnormal count
6805                               289
6806                               290
6807                               291    ;  Calculate the decimal magnitude associated
6808                               292    ;  with this number to within one order.  This
6809                               293    ;  error will always be inevitable due to
6810                               294    ;  rounding and lost precision. As a result,
6811                               295    ;  we will deliberately fail to consider the
6812                               296    ;  LOG10 of the fraction value in calculating
6813                               297    ;  the order. Since the fraction will always
6814                               298    ;  be 1 <= F < 2, its  LOG10 will not change
6815                               299    ;  the basic accuracy of the function. To
6816                               300    ;  get the decimal order of magnitude, simply
6817                               301    ;  multiply the power of two by LOG10(2) and
6818                               302    ;  truncate the result to an integer.
6819                               303    ;
6820                               304    normal_value:
6821                               305            fstp   fraction         ; Save the fraction field
6822                               306                               ; for later use
6823                               307            fist   power_two        ; Save power of two
6824                               308            fldlg2                          ; Get LOG10(2)
6825                               309                                            ; Power_two is now safe to use
6826                               310            fmul            ; Form LOG10(of exponent of number)
6827                               311            fistp  word ptr [ebx]   ; Any rounding mode
6828                               312                                                            ; will work here
6829                               313    ;
6830                               314    ;         Check if the magnitude of the number rules
6831                               315    ;      out treating it as an integer.
6832                               316    ;
6833                               317    ;       CX has the maximum number of decimal digits
6834                               318    ;   allowed.
6835                               319    ;
6836                               320            fwait           ; Wait for power_ten to be valid
6837                               321
6838                               322    ; Get power of ten of value
6839                               323            movsx esi, word ptr [ebx]
6840                               324            sub    esi,ecx                  ; Form scaling factor
6841                               325                                ; necessary in ax
6842                               326            ja     adjust_result    ; Jump if number will not fit
6843                               327    ;
6844                               328    ;         The number is between 1 and 10**(field size).
6845                               329    ;       Test if it is an integer.
6846                               330    ;
6847                               331            fild   power_two        ; Restore original number
6848                               332            sub    dl,NORMAL-EXACT  ; Convert to exact return
6849                               333                                ; value
6850                               334            fld    fraction
6851                               335            fscale                          ; Form full value,  this
6852                               336                                ; is safe here
6853                               337            fst    st(1)                    ; Copy value for compare
6854                               338            frndint                         ; Test if it's an integer
6855                               339            fcomp                           ; Compare values
6856                               340            fstsw  ax                       ; Save status
6857                               341            sahf                            ; C3=1 implies it was
6858                               342                                ; an integer
6859                               343            jz     convert_integer
6860                               344
6861                               345            fstp   st(0)            ; Remove non integer value
6862                               346            add    dl,NORMAL-EXACT  ; Restore original return value
6863                               347    ;
6864                               348    ;      Scale the number to within the range allowed
6865                               349    ;  by the BCD format.The scaling operation should
6866                               350    ;  produce a number within one decimal order of
6867                               351    ;  magnitude of the largest decimal number
6868                               352    ;  representable within the given string width.
6869                               353    ;
6870                               354    ;        The scaling power of ten value is in si.
6871                               355    ;
6872 000000F2                      356    adjust_result:
6873 000000F2 8BC6                 357           mov     eax,esi                 ; Setup for pow10
6874 000000F4 668903               358           mov     word ptr [ebx],ax       ; Set initial power
6875                               359                                    ; of ten return value
6876 000000F7 F7D8                 360           neg     eax              ; Subtract one for each order of
6877                               361                                    ; magnitude the value is scaled by
6878 000000F9 E800000000        E  362           call    get_power_10     ; Scaling factor is
6879                               363                                    ; returned as
6880                               364                                    ; exponent and fraction
6881 000000FE DB6DF2               365           fld     fraction                         ; Get fraction
6882 00000101 DEC9                 366           fmul                                     ; Combine fractions
6883 00000103 8BF1                 367           mov     esi,ecx                 ; Form power of ten of
6884                               368                                    ; the maximum
6885 00000105 C1E603               369           shl     esi,3                            ; BCD value to fit in
6886                               370                                    ; the string
6887 00000108 DF45FC               371           fild    power_two               ; Combine powers of two
6888 0000010B DEC2                 372           faddp   st(2),st
6889 0000010D D9FD                 373           fscale                                   ; Form full value,
6890                               374                                    ; exponent was safe
6891 0000010F DDD9                 375           fstp    st(1)                   ; Remove exponent
6892                               376    ;
6893                               377    ;        Test the adjusted value against a table
6894                               378    ;    of exact powers of ten. The combined errors
6895                               379    ;    of the magnitude estimate and power function
6896                               380    ;    can result in a value one order of magnitude
6897                               381    ;    too small or too large to fit correctly in
6898                               382    ;    the BCD field. To handle this problem, pretest
6899                               383    ;    the adjusted value, if it is too small or
6900                               384    ;    large, then adjust it by ten and adjust the
6901                               385    ;    power of ten value.
6902                               386    ;
6903 00000111                      387    test_power:
6904                               388
6905                               389    ; Compare against exact power entry. Use the next
6906                               390    ; entry since cx has been decremented by one
6907 00000111 2EDC9608000000    E  391            fcom    power_table[esi]+type power_table
6908 00000118 9BDFE0               392            fstsw ax                        ; No wait is necessary
6909 0000011B 9E                   393            sahf                ; If C3 = C0 = 0 then
6910 0000011C 720F                 394            jb      test_for_small  ; too big
6911                               395
6912 0000011E 2EDE3500000000    R  396            fidiv   const10          ; Else adjust value
6913 00000125 80E2FD               397            and     dl,not EXACT     ; Remove exact flag
6914 00000128 66FF03               398            inc     word ptr [ebx]   ; Adjust power of ten value
6915 0000012B EB17                 399            jmp     short in_range   ; Convert the value to a BCD
6916                               400                                ;  integer
6917 0000012D                      401    test_for_small:
6918 0000012D 2EDC9600000000    E  402            fcom    power_table[esi]        ; Test relative size
6919 00000134 9BDFE0               403            fstsw   ax                                      ; No wait is necessary
6920 00000137 9E                   404            sahf                                            ; If C0 = 0 then
6921                               405                                            ; st(0) >= lower bound
6922 00000138 720A                 406            jc      in_range                                ; Convert the value to a
6923                               407                                            ; BCD integer
6924                               408
6925 0000013A 2EDE0D00000000    R  409            fimul   const10         ; Adjust value into range
6926 00000141 66FF0B               410            dec     word ptr [ebx]  ; Adjust power of ten value
6927 00000144                      411    in_range:
6928 00000144 D9FC                 412            frndint                         ; Form integer value
6929                               413    ;
6930                               414    ;       Assert: 0 <= TOS <= 999,999,999,999,999,999
6931                               415    ;       The TOS number will be exactly representable
6932                               416    ;    in 18 digit BCD format.
6933                               417    ;
6934 00000146                      418    convert_integer:
6935 00000146 DF75F2               419            fbstp   bcd_value       ; Store as BCD format number
6936                               420    ;
6937                               421    ;         while the store BCD runs, setup registers
6938                               422    ;      for the conversion to ASCII.
6939                               423    ;
6940 00000149 BE08000000           424            mov     esi,BCD_SIZE-2  ; Initial BCD index value
6941 0000014E 66B9040F             425            mov     cx,0f04h                ; Set shift count and mask
6942 00000152 BB01000000           426            mov     ebx,1                   ; Set initial size of ASCII
6943                               427                                ; field for sign
6944 00000157 8B7D18               428            mov     edi,string_ptr  ; Get address of start of
6945                               429                                ; ASCII string
6946 0000015A 8CD8                 430            mov     ax,ds                   ; Copy ds to es
6947 0000015C 8EC0                 431            mov     es,ax
6948 0000015E FC                   432            cld                                     ; Set autoincrement mode
6949 0000015F B02B                 433            mov     al,'+'                  ; Clear sign field
6950 00000161 F6C201               434            test    dl,MINUS        ; Look for negative value
6951 00000164 7402                 435            jz      positive_result
6952                               436
6953 00000166 B02D                 437            mov     al,'-'
6954 00000168                      438    positive_result:
6955 00000168 AA                   439            stosb                           ; Bump string pointer
6956                               440                                ; past sign
6957 00000169 80E2FE               441            and     dl,not MINUS    ; Turn off sign bit
6958 0000016C 9B                   442            fwait                           ; Wait for fbstp to finish
6959                               443    ;
6960                               444    ;         Register usage:
6961                               445    ;                               ah:     BCD byte value in use
6962                               446    ;                               al:     ASCII character value
6963                               447    ;                               dx:     Return value
6964                               448    ;                               ch:     BCD mask = 0fh
6965                               449    ;                               cl:     BCD shift count = 4
6966                               450    ;                               bx:     ASCII string field width
6967                               451    ;                               esi:    BCD field index
6968                               452    ;                               di:     ASCII string field pointer
6969                               453    ;                               ds,es:  ASCII string segment base
6970                               454    ;
6971                               455    ;         Remove leading zeroes from the number.
6972                               456    ;
6973 0000016D                      457    skip_leading_zeroes:
6974 0000016D 8A6435F2             458           mov     ah,bcd_byte[esi]                ; Get BCD byte
6975 00000171 88E0                 459           mov     al,ah                   ; Copy value
6976 00000173 D2E8                 460           shr     al,cl                   ; Get high order digit
6977 00000175 240F                 461           and     al,0fh                  ; Set zero flag
6978 00000177 7517                 462           jnz     enter_odd               ; Exit loop if leading
6979                               463                               ; non zero found
6980                               464
6981 00000179 88E0                 465           mov     al,ah                   ; Get BCD byte again
6982 0000017B 240F                 466           and     al,0fh                  ; Get low order digit
6983 0000017D 7519                 467           jnz     enter_even              ; Exit loop if non zero
6984                               468                               ; digit found
6985                               469
6986 0000017F 4E                   470           dec     esi                             ; Decrement BCD index
6987 00000180 79EB                 471           jns     skip_leading_zeroes
6988                               472    ;
6989                               473    ;        The significand was all zeroes.
6990                               474    ;
6991 00000182 B030                 475           mov     al,'0'                  ; Set initial zero
6992 00000184 AA                   476           stosb
6993 00000185 43                   477           inc     ebx                             ; Bump string length
6994 00000186 EB17                 478           jmp     short exit_with_value
6995                               479    ;
6996                               480    ;        Now expand the BCD string into digit
6997                               481    ;     per byte values 0-9.
6998                               482    ;
6999 00000188                      483    digit_loop:
7000 00000188 8A6435F2             484           mov     ah,bcd_byte[esi]        ; Get BCD byte
7001 0000018C 88E0                 485           mov     al,ah
7002 0000018E D2E8                 486           shr     al,cl                   ; Get high order digit
7003 00000190                      487    enter_odd:
7004 00000190 0430                 488           add     al,'0'                  ; Convert to ASCII
7005 00000192 AA                   489           stosb                           ; Put digit into ASCII
7006                               490                               ; string area
7007 00000193 88E0                 491           mov     al,ah                   ; Get low order digit
7008 00000195 240F                 492           and     al,0fh
7009 00000197 43                   493           inc     ebx                     ; Bump field size counter
7010 00000198                      494    enter_even:
7011 00000198 0430                 495           add     al,'0'           ; Convert to ASCII
7012 0000019A AA                   496           stosb                    ; Put digit into ASCII area
7013 0000019B 43                   497           inc     ebx                     ; Bump field size counter
7014 0000019C 4E                   498           dec     esi                     ; Go to next BCD byte
7015 0000019D 79E9                 499           jns     digit_loop
7016                               500    ;
7017                               501    ;        Conversion complete.  Set the string
7018                               502    ;     size and remainder.
7019                               503    ;
7020 0000019F                      504    exit_with_value:
7021 0000019F 8B7D14               505           mov     edi,size_ptr
7022 000001A2 66891F               506           mov     word ptr [edi],bx
7023 000001A5 8BC2                 507           mov     eax,edx                 ; Set return value
7024 000001A7 E980FEFFFF           508           jmp     exit_proc
7025                               509
7026 000001AC                      510    floating_to_ascii      endp
7027                               511
7028 --------                      512    code                   ends
7029                               513                           end
7030
7031 ASSEMBLY COMPLETE,   NO WARNINGS,   NO ERRORS.
7032
7033
7034 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE GET_POWER_10
7035 OBJECT MODULE PLACED IN power10.obj
7036 ASSEMBLER INVOKED BY: asm386 power10.asm
7037
7038 LOC      OBJ                         LINE     SOURCE
7039
7040                              1 +1 $title(Calculate the value of 10**ax)
7041                              2    ;
7042                              3    ;       This subroutine will calculate the
7043                              4    ;    value of 10**eax.  For values of
7044                              5    ;    0 <= eax < 19, the result will be exact.
7045                              6    ;    All 80386 registers are transparent
7046                              7    ;    and the value is returned on the TOS
7047                              8    ;    as two numbers, exponent in ST(1) and
7048                              9    ;    fraction in ST(0). The exponent value
7049                             10    ;    can be larger than the largest
7050                             11    ;    exponent of an extended real format
7051                             12    ;    number.  Three stack entries are used.
7052                             13    ;
7053                             14                    name    get_power_10
7054 00000000                    15                    public  get_power_10,power_table
7055                             16
7056 --------                    17    stack           stackseg    8
7057                             18
7058 --------                    19    code            segment public er
7059                             20    ;
7060                             21    ;         Use exact values from 1.0 to 1e18.
7061                             22    ;
7062                             23                    even            ; Optimize 16 bit access
7063 00000000 000000000000F03F   24    power_table     dq      1.0,1e1,1e2,1e3
7064 00000008 0000000000002440
7065 00000010 0000000000005940
7066 00000018 0000000000408F40
7067 00000020 000000000088C340   25                    dq      1e4,1e5,1e6,1e7
7068 00000028 00000000006AF840
7069 00000030 0000000080842E41
7070 00000038 00000000D0126341
7071 00000040 0000000084D79741   26                    dq      1e8,1e9,1e10,1e11
7072 00000048 0000000065CDCD41
7073 00000050 000000205FA00242
7074 00000058 000000E876483742
7075 00000060 000000A2941A6D42   27                    dq      1e12,1e13,1e14,1e15
7076 00000068 000040E59C30A242
7077 00000070 0000901EC4BCD642
7078 00000078 00003426F56B0C43
7079 00000080 0080E03779C34143   28                    dq      1e16,1e17,1e18
7080 00000088 00A0D88557347643
7081 00000090 00C84E676DC1AB43
7082                             29
7083 00000098                    30    get_power_10    proc
7084                             31
7085 00000098 3D12000000         32            cmp     eax,18          ; Test for 0 <= ax < 19
7086 0000009D 770B               33            ja      out_of_range
7087                             34
7088 0000009F 2EDD04C500000000 R 35            fld     power_table[eax*8]; Get exact value
7089 000000A7 D9F4               36            fxtract                         ; Separate power
7090
7091
7092 7.3.1  Function Partitioning
7093
7094 Three separate modules implement the conversion. Most of the work of the
7095 conversion is done in the module FLOATING_TO_ASCII. The other modules are
7096 provided separately, because they have a more general use. One of them,
7097 GET_POWER_10, is also used by the ASCII to floating-point conversion
7098 routine. The other small module, TOS_STATUS, identifies what, if anything,
7099 is in the top of the numeric register stack.
7100
7101
7102 7.3.2  Exception Considerations
7103
7104 Care is taken inside the function to avoid generating exceptions. Any
7105 possible numeric value is accepted. The only possible exception is
7106 insufficient space on the numeric register stack.
7107
7108 The value passed in the numeric stack is checked for existence, type (NaN
7109 or infinity), and status (denormal, zero, sign). The string size is tested
7110 for a minimum and maximum value. If the top of the register stack is empty,
7111 or the string size is too small, the function returns with an error code.
7112
7113 Overflow and underflow are avoided inside the function for very large or
7114 very small numbers.
7115
7116
7117 7.3.3  Special Instructions
7118
7119 The functions demonstrate the operation of several numeric instructions,
7120 different data types, and precision control. Shown are instructions for
7121 automatic conversion to BCD, calculating the value of 10 raised to an
7122 integer value, establishing and maintaining concurrency, data
7123 synchronization, and use of directed rounding on the NPX.
7124
7125 Without the extended precision data type and built-in exponential function,
7126 the double precision accuracy of this function could not be attained with
7127 the size and speed of the shown example.
7128
7129 The function relies on the numeric BCD data type for conversion from binary
7130 floating-point to decimal. It is not difficult to unpack the BCD digits into
7131 separate ASCII decimal digits. The major work involves scaling the
7132 floating-point value to the comparatively limited range of BCD values. To
7133 print a 9-digit result requires accurately scaling the given value to an
7134 integer between 10^(8) and 10^(9). For example, the number +0.123456789
7135 requires a scaling factor of 10^(9) to produce the value +123456789.0, which
7136 can be stored in 9 BCD digits. The scale factor must be an exact power of
7137 10 to avoid changing any of the printed digit values.
7138
7139 These routines should exactly convert all values exactly representable in
7140 decimal in the field size given. Integer values that fit in the given string
7141 size are not scaled, but are stored directly in BCD form. Noninteger
7142 values exactly representable in decimal within the string size limits are
7143 also exactly converted. For example, 0.125 is exactly representable in
7144 binary or decimal. To convert this floating-point value to decimal, the
7145 scaling factor is 1000, resulting in 125. When scaling a value, the function
7146 must keep track of where the decimal point lies in the final decimal value.
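
The 0.125 example can be checked with a short Python sketch. It is not part
of the manual's routines; the name scale_exact and the use of exact rational
arithmetic are assumptions made for illustration. The sketch looks for the
smallest power of ten that turns the value into an integer and returns that
integer together with the power value that records where the decimal point
lies.

```python
from fractions import Fraction

def scale_exact(value, max_digits=18):
    """Return (digits, power) with value == digits * 10**power exactly,
    or None if no such integer fits in max_digits decimal digits.
    Hypothetical helper, for illustration only."""
    exact = Fraction(value)              # exact binary value of the float
    for shift in range(max_digits + 1):
        scaled = exact * 10**shift       # try a scale factor of 10**shift
        if scaled.denominator == 1:      # scaled value is an integer
            if abs(scaled.numerator) < 10**max_digits:
                return scaled.numerator, -shift
            return None                  # integer, but too wide for the field
    return None                          # not exactly representable in decimal

print(scale_exact(0.125))   # 0.125 scaled by 1000
print(scale_exact(42.0))    # integers need no scaling
print(scale_exact(0.1))     # 0.1 has no exact binary form, so no exact scaling
```

A value such as 0.1 returns None here because its binary representation is
not exactly 1/10; the manual's routines round such values instead.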
7147
7148
7149 7.3.4  Description of Operation
7150
7151 Converting a floating-point number to decimal ASCII takes three major
7152 steps: identifying the magnitude of the number, scaling it for the BCD data
7153 type, and converting the BCD data type to a decimal ASCII string.
7154
7155 Identifying the magnitude of the result requires finding the value X such
7156 that the number is represented by I * 10^(X), where 1.0 <= I < 10.0. Scaling
7157 the number requires multiplying it by a scaling factor 10^(S), so that the
7158 result is an integer requiring no more decimal digits than provided for in
7159 the ASCII string.
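
The relationship between X, I, and S can be sketched in Python as follows.
This is an illustration under stated assumptions, not the method used by the
assembly routines: it relies on the host library's log10 rather than on the
binary exponent, and the names decimal_magnitude and scale_exponent are
hypothetical.

```python
import math

def decimal_magnitude(value):
    """Return (i, x) such that value is approximately i * 10**x
    with 1.0 <= |i| < 10.0. Hypothetical helper for illustration."""
    if value == 0.0:
        raise ValueError("zero has no decimal magnitude")
    x = math.floor(math.log10(abs(value)))
    i = value / 10.0**x
    # Guard against a log10 result that is off by one near a power of ten.
    if abs(i) >= 10.0:
        i /= 10.0
        x += 1
    elif abs(i) < 1.0:
        i *= 10.0
        x -= 1
    return i, x

def scale_exponent(x, digits):
    """Power of ten S that scales a value of magnitude x to 'digits' digits."""
    return digits - 1 - x

i, x = decimal_magnitude(0.123456789)
print(i, x, scale_exponent(x, 9))
```

For +0.123456789 printed in 9 digits, X is -1 and S is 9, which agrees with
the scaling factor of 10^(9) quoted in section 7.3.3.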
7160
7161 Once scaled, the numeric rounding modes and BCD conversion put the number
7162 in a form easy to convert to decimal ASCII by host software.
7163
7164 Implementing each of these three steps requires attention to detail. To
7165 begin with, not all floating-point values have a numeric meaning. Values
7166 such as infinity, indefinite, or NaN may be encountered by the conversion
7167 routine. The conversion routine should recognize these values and identify
7168 them uniquely.
7169
7170 Special cases of numeric values also exist. Denormals have numeric values,
7171 but should be recognized because they indicate that precision was lost
7172 during some earlier calculations.
7173
7174 Once it has been determined that the number has a numeric value, and it is
7175 normalized (setting appropriate denormal flags, if necessary, to indicate
7176 this to the calling program), the value must be scaled to the BCD range.
7177
7178
7179 7.3.5  Scaling the Value
7180
7181 To scale the number, its magnitude must be determined. It is sufficient to
7182 calculate the magnitude to an accuracy of 1 unit, or within a factor of 10
7183 of the required value. After scaling the number, a check is made to see if
7184 the result falls in the range expected. If not, the result can be adjusted
7185 one decimal order of magnitude up or down. The adjustment test after the
7186 scaling is necessary due to inevitable inaccuracies in the scaling value.
7187
7188 Because the magnitude estimate for the scale factor need only be close, a
7189 fast technique is used. The magnitude is estimated by multiplying the power
7190 of 2, the unbiased floating-point exponent, associated with the number by
7191 log{10}2. Rounding the result to an integer produces an estimate of
7192 sufficient accuracy. Ignoring the fraction value can introduce a maximum
7193 error of 0.32 in the result.
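
The fast estimate can be sketched in Python. The sketch assumes IEEE double
precision values on the host rather than the 80387 extended format, and the
name estimate_magnitude is hypothetical. It multiplies the unbiased binary
exponent by log{10}2, rounds to an integer, and ignores the fraction.

```python
import math

LOG10_2 = math.log10(2.0)

def estimate_magnitude(value):
    """Estimate the decimal order of magnitude of a nonzero value
    from its binary exponent alone (the fraction is ignored)."""
    _, e = math.frexp(value)     # value = m * 2**e with 0.5 <= |m| < 1
    unbiased = e - 1             # exponent for a significand in [1.0, 2.0)
    return round(unbiased * LOG10_2)

for v in (0.001, 0.5, 1.0, 7.0, 1024.0, 12345.0):
    print(v, estimate_magnitude(v), math.floor(math.log10(v)))
```

The estimate either matches the true order of magnitude or exceeds it by one,
which is why the scaled value is tested and adjusted afterwards.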
7194
7195 Using the magnitude of the value and size of the number string, the scaling
7196 factor can be calculated. Calculating the scaling factor is the most
7197 inaccurate operation of the conversion process. The relation
7198 10^(X) = 2^(X * log{2}10) is used for this function. The exponentiate
7199 instruction F2XM1 is used.
7200
7201 Due to restrictions on the range of values allowed by the F2XM1
7202 instruction, the power of 2 value is split into integer and fraction
7203 components. The relation 2^(I + F) = 2^(I) * 2^(F) allows using the FSCALE
7204 instruction to recombine the 2^(F) value, calculated through F2XM1, and the
7205 2^(I) part.
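
A minimal Python sketch of this split is shown below. It is an illustration
only: math.expm1 stands in for F2XM1 and math.ldexp stands in for FSCALE, the
arithmetic is IEEE double precision rather than extended precision, and the
function names are hypothetical.

```python
import math

LOG2_10 = math.log2(10.0)

def power_of_ten_parts(p):
    """Return (fraction, exponent) with 10**p ~= fraction * 2**exponent."""
    x = p * LOG2_10                  # 10**p == 2**x
    i = round(x)                     # integer part I
    f = x - i                        # fraction part F, |F| <= 0.5
    two_f = math.expm1(f * math.log(2.0)) + 1.0   # 2**F, via (2**F - 1) + 1
    return two_f, i

def power_of_ten(p):
    """Recombine 2**F and 2**I, as FSCALE does."""
    frac, exp = power_of_ten_parts(p)
    return math.ldexp(frac, exp)

print(power_of_ten(5), power_of_ten(-7))
```

Because the product p * log{2}10 is rounded before it is split, the low-order
bits of F are lost as the integer part grows, which is the inaccuracy
discussed in the next section.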
7206
7207
7208 7.3.5.1  Inaccuracy in Scaling
7209
7210 The inaccuracy in calculating the scale factor arises because of the
7211 trailing zeros placed into the fraction value of the power of two when
7212 stripping off the integer valued bits. For each integer valued bit in the
7213 power of 2 value separated from the fraction bits, one bit of precision is
7214 lost in the fraction field due to the zero fill occurring in the least
7215 significant bits.
7216
7217 Up to 14 bits may be lost in the fraction because the largest allowed
7218 floating point exponent value is 2^(14) - 1. These bits directly reduce the
7219 accuracy of the calculated scale factor, thereby reducing the accuracy of
7220 the scaled value. For numbers in the range of 10^(±30), a maximum of 8 bits
7221 of precision are lost in the scaling process.
7222
7223
7224 7.3.5.2  Avoiding Underflow and Overflow
7225
7226 The fraction and exponent fields of the number are separated to avoid
7227 underflow and overflow in calculating the scaling values. For example, to
7228 scale 10^(-4932) to 10^(8) requires a scaling factor of 10^(4950), which
7229 cannot be represented by the NPX.
7230
7231 By separating the exponent and fraction, the scaling operation involves
7232 adding the exponents separate from multiplying the fractions. The exponent
7233 arithmetic involves small integers, all easily represented by the NPX.
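
The benefit of keeping the fraction and exponent apart can be sketched in
Python. The pair values used here are arbitrary illustrations chosen to exceed
the range of an IEEE double (the host format assumed by this sketch), and the
function names are hypothetical.

```python
import math

def scale(number, factor):
    """Multiply two (fraction, exponent) pairs, each meaning
    fraction * 2**exponent, without forming either full value."""
    frac = number[0] * factor[0]     # multiply the fractions
    exp = number[1] + factor[1]      # add the exponents as small integers
    return frac, exp

def combine(pair):
    """Form the full value once the combined exponent is known to be safe."""
    return math.ldexp(pair[0], pair[1])

tiny = (0.75, -16384)    # far below the smallest double
huge = (0.5, 16420)      # far above the largest double
result = scale(tiny, huge)
print(result, combine(result))
```

Neither operand can be formed as a single double (the first underflows to
zero and the second overflows), yet the scaled result is an ordinary number.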
7234
7235
7236 7.3.5.3  Final Adjustments
7237
7238 It is possible that the power function (Get_Power_10) could produce a
7239 scaling value such that it forms a scaled result larger than the ASCII field
7240 could allow. For example, scaling 9.9999999999999999 * 10^(4900) by
7241 1.00000000000000010 * 10^(-4883) produces 1.00000000000000009 * 10^(18). The
7242 scale factor is within the accuracy of the NPX and the result is within the
7243 conversion accuracy, but it cannot be represented in BCD format. This is why
7244 there is a post-scaling test on the magnitude of the result. The result can
7245 be multiplied or divided by 10, depending on whether the result was too
7246 small or too large, respectively.
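
The post-scaling test can be sketched in Python as follows. The function name
adjust_into_range and its parameters are hypothetical; the sketch assumes the
estimate is off by at most one order of magnitude, as stated above.

```python
def adjust_into_range(scaled, power, digits):
    """Bring a scaled value into [10**(digits-1), 10**digits) with a single
    adjustment by ten, updating the power value so that
    scaled * 10**power is unchanged."""
    upper = 10.0 ** digits
    lower = 10.0 ** (digits - 1)
    if scaled >= upper:          # too large: divide by ten
        scaled /= 10.0
        power += 1
    elif scaled < lower:         # too small: multiply by ten
        scaled *= 10.0
        power -= 1
    return scaled, power

print(adjust_into_range(1.2e9, -5, 9))
print(adjust_into_range(5.0e7, 0, 9))
```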
7247
7248
7249 7.3.6  Output Format
7250
7251 For maximum flexibility in output formats, the position of the decimal
7252 point is indicated by a binary integer called the power value. If the power
7253 value is zero, then the decimal point is assumed to be at the right of the
7254 rightmost digit. Power values greater than zero indicate how many trailing
7255 zeros are not shown. For each unit below zero, move the decimal point one
7256 position to the left in the string.
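
One way a caller might apply the power value is sketched below in Python. The
function name place_decimal_point is hypothetical, and the manual leaves the
final layout to the calling program.

```python
def place_decimal_point(digits, power):
    """Render an unsigned digit string and a power value as a decimal string."""
    if power >= 0:
        return digits + "0" * power          # restore the trailing zeros
    shift = -power                           # positions to move the point left
    if shift >= len(digits):
        # Pad with leading zeros so that one digit remains before the point.
        digits = "0" * (shift - len(digits) + 1) + digits
    return digits[:-shift] + "." + digits[-shift:]

print(place_decimal_point("125", -3))
print(place_decimal_point("125", 2))
print(place_decimal_point("12345", -2))
```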
7257
7258 The last step of the conversion is storing the result in BCD and indicating
7259 where the decimal point lies. The BCD string is then unpacked into ASCII
7260 decimal characters. The ASCII sign is set corresponding to the sign of the
7261 original value.
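
The unpacking step can be sketched in Python. The sketch assumes the 10-byte
packed decimal layout stored by FBSTP: bytes 0 through 8 hold 18 digits, two
per byte with the least significant pair first, and bit 7 of byte 9 holds the
sign. The function name is hypothetical.

```python
def packed_bcd_to_ascii(bcd):
    """Convert a 10-byte packed BCD value to a signed ASCII digit string."""
    if len(bcd) != 10:
        raise ValueError("packed BCD operand must be 10 bytes")
    sign = "-" if bcd[9] & 0x80 else "+"
    digits = []
    for byte in reversed(bcd[:9]):                    # most significant byte first
        digits.append(chr(ord("0") + (byte >> 4)))    # high-order digit
        digits.append(chr(ord("0") + (byte & 0x0F)))  # low-order digit
    text = "".join(digits).lstrip("0") or "0"         # remove leading zeroes
    return sign + text

value = bytes([0x89, 0x67, 0x45, 0x23, 0x01, 0, 0, 0, 0, 0x00])
print(packed_bcd_to_ascii(value))
```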
7262
7263
7264 7.4  Trigonometric Calculation Examples (Not Tested)
7265
7266 In this example, the kinematics of a robot arm is modeled with the 4 * 4
7267 homogeneous transformation matrices proposed by Denavit and Hartenberg
7268 (J. Denavit and R.S. Hartenberg, "A Kinematic Notation for Lower-Pair
7269 Mechanisms Based on Matrices," J. Applied Mechanics, June 1955, pp. 215-221;
7270 C.S. George Lee, "Robot Arm Kinematics, Dynamics, and Control," IEEE
7271 Computer, Dec. 1982).
7272
7273 The translational and rotational relationships between adjacent links are
7274 described with these matrices using the D-H matrix method. For each link,
7275 there is a 4 * 4 homogeneous transformation matrix that represents the
7276 link's coordinate system (L{i}) at the joint (J{i}) with respect to the
7277 previous link's coordinate system (J{i-1}, L{i-1}). The following four
7278 geometric quantities completely describe the motion of any rigid joint/link
7279 pair (J{i}, L{i}), as Figure 7-7 (see page 7-22 in the printed version of
7280 this manual) illustrates.
7281
7282   θ{i} =  The angular displacement of the x{i} axis from the x{i-1} axis by
7283           rotating around the z{i-1} axis (anticlockwise).
7284
7285   d{i} =  The distance from the origin of the (i-1)^(th) coordinate system
7286           along the z{i-1} axis to the x{i} axis.
7287
7288   a{i} =  The distance of the origin of the i^(th) coordinate system from
7289           the z{i-1} axis along the -x{i} axis.
7290
7291   α{i} =  The angular displacement of the z{i} axis from the z{i-1} axis about
7292           the x{i} axis (anticlockwise).
7293
7294 The D-H transformation matrix A^(i){i-1} for adjacent coordinate frames
7295 (from joint{i-1} to joint{i}) is calculated as follows:
7296
7297   A^(i){i-1}  =  T{z,d} * T{z,θ} * T{x,a} * T{x,α}
7298
7299 ...where...
7300
7301   T{z,d}    represents a translation along the z{i-1} axis
7302
7303   T{z,θ}    represents a rotation of angle θ about the z{i-1} axis
7304
7305   T{x,a}    represents a translation along the x{i} axis
7306
7307   T{x,α}    represents a rotation of angle α about the x{i} axis
7308
7309               | COS θ{i}   -COS α{i}SIN θ{i}   SIN α{i}SIN θ{i}   a{i}COS θ{i} |
7310 A^(i){i-1} =  | SIN θ{i}    COS α{i}COS θ{i}  -SIN α{i}COS θ{i}   a{i}SIN θ{i} |
7311               | 0           SIN α{i}           COS α{i}           d{i}         |
7312               | 0           0                  0                  1            |
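
As a cross-check on the matrix, the following Python sketch builds the D-H
matrix from the four link parameters. It is an illustration only and is not
the TRANS_PROC procedure; the function name dh_matrix is hypothetical. The
fourth-column entries a * cos(theta) and a * sin(theta) are what results from
multiplying out the four elementary transformations in the order given above.

```python
import math

def dh_matrix(theta, d, a, alpha):
    """Return the 4 x 4 D-H matrix for one joint/link pair
    (angles in radians), as a list of rows."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct,  -ca * st,  sa * st, a * ct],
        [st,   ca * ct, -sa * ct, a * st],
        [0.0,  sa,       ca,      d],
        [0.0,  0.0,      0.0,     1.0],
    ]

for row in dh_matrix(math.pi / 2, 2.0, 3.0, 0.0):
    print(row)
```

On the 80387, a single FSINCOS yields both the sine and the cosine of an
angle, so each angle needs only one transcendental instruction.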
7313
7314 The composite homogeneous matrix T which represents the position and
7315 orientation of the joint/link pair with respect to the base system is
7316 obtained by successively multiplying the D-H transformation matrices for
7317 adjacent coordinate frames.
7318
7319   T^(i){0} = A^(1){0}  *  A^(2){1}  * ... *  A^(i){i-1}
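
The successive multiplication can be sketched in Python. This is an
illustration only and is not the MATRIXMUL_PROC procedure; the function names
are hypothetical, and the two matrices in the usage lines are simple
translations chosen so the result is easy to verify by hand.

```python
def matmul4(a, b):
    """Multiply two 4 x 4 matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def compose(matrices):
    """Multiply the D-H matrices in order, starting from the identity."""
    result = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    for m in matrices:
        result = matmul4(result, m)
    return result

# Two pure translations: 1 unit along x, then 2 units along z.
t1 = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
t2 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 2.0], [0.0, 0.0, 0.0, 1.0]]
for row in compose([t1, t2]):
    print(row)
```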
7320
7321 The example in Figure 7-8 illustrates how the transformation process can
7322 be accomplished using the 80387. The program consists of two major
7323 procedures. The first procedure TRANS_PROC is used to calculate the elements
7324 in each D-H matrix, A^(i){i-1}.  The second procedure MATRIXMUL_PROC finds
7325 the product of two successive D-H matrices.
7326
7327
7328 Figure 7-8.  Robot Arm Kinematics Example
7329
7330 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE TOS_STATUS
7331 OBJECT MODULE PLACED IN tos.obj
7332 ASSEMBLER INVOKED BY: asm386 tos.asm
7333
7334 LOC      OBJ                LINE    SOURCE
7335
7336                        1 +1 $title(Determine TOS register contents)
7337                        2    ;
7338                        3    ;         This subroutine will return a value
7339                        4    ;    from 0-15 in eax corresponding
7340                        5    ;         to the contents of NPX TOS.  All
7341                        6    ;    registers are transparent and no
7342                        7    ;         errors are possible.  The return
7343                        8    ;    value corresponds to c3,c2,c1,c0
7344                        9    ;         of FXAM instruction.
7345                       10    ;
7346                       11            name    tos_status
7347 00000000              12            public  tos_status
7348                       13
7349 --------              14    stack           stackseg      6
7350                       15
7351 --------              16    code            segment public er
7352                       17
7353 00000000              18    tos_status      proc
7354                       19
7355 00000000 D9E5         20            fxam                    ; Get status of TOS register
7356 00000002 9BDFE0       21            fstsw  ax       ; Get current status
7357 00000005 88E0         22            mov     al,ah           ; Put bits 10-8 into bits 2-0
7358 00000007 2507400000   23            and     eax,4007h       ; Mask out bits c3,c2,c1,c0
7359 0000000C C0EC03       24            shr     ah, 3           ; Put bit c3 into bit 11
7360 0000000F 08E0         25            or      al,ah           ; Put c3 into bit 3
7361 00000011 B400         26            mov     ah,0            ; Clear return value
7362 00000013 C3           27            ret
7363                       28
7364 00000014              29    tos_status      endp
7365                       30
7366 --------              31    code            ends
7367                       32                    end
7368
7369 ASSEMBLY COMPLETE,   NO WARNINGS,    NO ERRORS.
7370
7371
7372 LOC      OBJ                LINE    SOURCE
7373
7374                     37                                ; and fraction
7375 000000A9 C3         38            ret             ; OK to leave fxtract running
7376                     39    ;
7377                     40    ;         Calculate the value using the
7378                     41    ;     exponentiate instruction. The following
7379                     42    ;     relations are used:
7380                     43    ;               10**x = 2**(log2(10)*x)
7381                     44    ;               2**(I+F) = 2**I  * 2**F
7382                     45    ;       if st(1) = I and st(0)  = 2**F then
7383                     46    ;       fscale produces 2**(I+F)
7384                     47    ;
7385 000000AA            48    out_of_range:
7386                     49
7387 000000AA D9E9       50            fldl2t                  ; TOS = LOG2(10)
7388 000000AC C8040000   51            enter   4,0
7389                     52
7390                     53        ; save power of 10 value, P
7391 000000B0 8945FC     54        mov     [ebp-4],eax
7392                     55
7393                     56        ; TOS,X = LOG2(10)*P =  LOG2(10**P)
7394 000000B3 DA4DFC     57            fimul   dword ptr [ebp-4]
7395 000000B6 D9E8       58            fld1            ; Set TOS = -1.0
7396 000000B8 D9E0       59            fchs
7397 000000BA D9C1       60            fld     st(1)   ; Copy power value
7398                     61                            ; in base two
7399 000000BC D9FC       62            frndint         ; TOS = I: -inf < I <= X
7400                     63                            ; where I is an integer
7401                     64                            ; Rounding mode does
7402                     65                            ; not matter
7403 000000BE D9CA       66            fxch    st(2)   ; TOS = X, ST(1) = -1.0
7404                     67                                ; ST(2) = I
7405 000000C0 D8E2       68            fsub    st,st(2)    ; TOS,F = X-I:
7406                     69                                ; -1.0 < TOS <= 1.0
7407                     70
7408                     71            ; Restore original rounding control
7409 000000C2 58         72            pop     eax
7410 000000C3 D9F0       73            f2xm1                   ; TOS = 2**(F) - 1.0
7411 000000C5 C9         74            leave                   ; Restore stack
7412 000000C6 DEE1       75            fsubr                   ; Form 2**(F)
7413 000000C8 C3         76            ret                     ; OK to leave fsubr running
7414                     77
7415 000000C9            78    get_power_10    endp
7416                     79
7417 --------            80    code            ends
7418                     81                    end
7419
7420 ASSEMBLY COMPLETE,   NO WARNINGS,   NO ERRORS.
7421
7422
7423 XENIX286 80386 MACRO ASSEMBLER V1.0, ASSEMBLY OF MODULE ROT_MATRIX_CAL
7424 OBJECT MODULE PLACED IN transx.obj
7425 ASSEMBLER INVOKED BY: asm386 transx.asm
7426
7427 LOC      OBJ                LINE    SOURCE
7428
7429                                1    Name ROT_MATRIX_CAL
7430                                2
7431                                3
7432                                4
7433                                5    ; This example illustrates the use
7434                                6    ; of the 80387 floating point
7435                                7    ; instructions, in particular, the
7436                                8    ; FSINCOS function which  gives both
7437                                9    ; the SIN and COS values.
7438                                10   ; The program calculates the
7439                                11   ; composite matrix for base to
7440                                12   ; end-effector transformation.
7441                                13   ;
7442                                14   ; Only the kinematics is considered in
7443                                15   ; this example.
7444                                16   ;
7445                                17   ; If the composite matrix mentioned above
7446                                18   ; is given by:
7447                                19   ; T1n = A1 x A2 x ... x An
7448                                20   ; T1n is found by successively calling
7449                                21   ; trans_proc and matrixmul_proc until
7450                                22   ; all matrices have been exhausted.
7451                                23   ;
7452                                24   ; trans_proc calculates entries in each
7453                                25   ; A(A1,...,An) while matrixmul_proc
7454                                26   ; performs the matrix multiplication for
7455                                27   ; Ai and Ai+1. matrixmul_proc in turn
7456                                28   ; calls matrix_row and matrix_elem to
7457                                29   ; do the multiplication.
7458                                30
7459                                31
7460                                32   ; Define stack space
7461                                33
7462 --------                       34   trans_stack stackseg 400
7463                                35
7464                                36   ; Define the matrix structure for
7465                                37   ; 4X4 transformational matrices
7466                                38
7467 --------                       39     a_matrix struc
7468 00000000                       40             a11    dq    ?
7469 00000008                       41             a12    dq    ?
7470 00000010                       42             a13    dq    ?
7471 00000018                       43             a14    dq    ?
7472 00000020                       44             a21    dq    ?
7473 00000028                       45             a22    dq    ?
7474 00000030                       46             a23    dq    ?
7475 00000038                       47             a24    dq    ?
7476 00000040                       48             a31    dq    0h
7477 00000048                       49             a32    dq    ?
7478 00000050                       50             a33    dq    ?
7479 00000058                       51             a34    dq    ?
7480 00000060                       52             a41    dq    0h
7481 00000068                       53             a42    dq    0h
7482 00000070                       54             a43    dq    0h
7483 00000078                       55             a44    dq    1h
7484 --------                       56     a_matrix ends
7485                                57
7486                                58   ; Assume one joint in the storage
7487                                59   ; allocation, and hence two sets
7488                                60   ; of parameters; however,
7489                                61   ; more joints are possible
7490                                62   ;
7491                                63     alp_deg struc
7492 00000000                       64             alpha_deg1 dd  ?
7493 00000004                       65             alpha_deg2 dd  ?
7494 --------                       66     alp_deg ends
7495                                67
7496 --------                       68     tht_deg struc
7497 00000000                       69             theta_deg1 dd  ?
7498 00000004                       70             theta_deg2 dd  ?
7499 --------                       71     tht_deg ends
7500                                72
7501 --------                       73     A_array struc
7502 00000000                       74             A1         dq  ?
7503 00000008                       75             A2         dq  ?
7504 --------                       76     A_array ends
7505                                77
7506 --------                       78     D_array struc
7507 00000000                       79             D1         dq  ?
7508 00000008                       80             D2         dq  ?
7509 --------                       81     D_array ends
7510                                82
7511                                83   ; trans_data is the data segment
7512                                84   ;
7513                                85
7514 -------                        86   trans_data       segment rw public
7515                                87
7516                                88           Amx            a_matrix<>
7517 00000000 ????????????????
7518 00000008 ????????????????
7519 00000010 ????????????????
7520 00000018 ????????????????
7521 00000020 ????????????????
7522 00000028 ????????????????
7523 00000030 ????????????????
7524 00000038 ????????????????
7525 00000040 0000000000000000
7526 00000048 ????????????????
7527 00000050 ????????????????
7528 00000058 ????????????????
7529 00000060 0000000000000000
7530 00000068 0000000000000000
7531 00000070 0000000000000000
7532 00000078 0100000000000000
7533 00000080 ????????????????      89           Bmx            a_matrix<>
7534 00000088 ????????????????
7535 00000090 ????????????????
7536 00000098 ????????????????
7537 000000A0 ????????????????
7538 000000A8 ????????????????
7539 000000B0 ????????????????
7540 000000B8 ????????????????
7541 000000C0 0000000000000000
7542 000000C8 ????????????????
7543 000000D0 ????????????????
7544 000000D8 ????????????????
7545 000000E0 0000000000000000
7546 000000E8 0000000000000000
7547 000000F0 0000000000000000
7548 000000F8 0100000000000000
7549 00000100 ????????????????      90           Tmx            a_matrix<>
7550 00000108 ????????????????
7551 00000110 ????????????????
7552 00000118 ????????????????
7553 00000120 ????????????????
7554 00000128 ????????????????
7555 00000130 ????????????????
7556 00000138 ????????????????
7557 00000140 0000000000000000
7558 00000148 ????????????????
7559 00000150 ????????????????
7560 00000158 ????????????????
7561 00000160 0000000000000000
7562 00000168 0000000000000000
7563 00000170 0000000000000000
7564 00000178 0100000000000000
7565 00000180 ????????              91           ALPHA_DEG      alp_deg<>
7566 00000184 ????????
7567 00000188 ????????              92           THETA_DEG      tht_deg<>
7568 0000018C ????????
7569 00000190 ????????????????      93           A_VECTOR       A_array<>
7570 00000198 ????????????????
7571 000001A0 ????????????????      94           D_VECTOR       D_array<>
7572 000001A8 ????????????????
7573 000001B0 00000000              95           ZERO           dd            0
7574 000001B4 B4000000              96           d180           dd          180
7575   0001                         97           NUM_JOINT      equ           1
7576   0004                         98           NUM_ROW        equ           4
7577   0004                         99           NUM_COL        equ           4
7578 000001B8 01                   100           REVERSE        db            1h
7579 --------                      101   trans_data ends
7580                               102
7581                               103   assume    ds:trans_data, es:trans_data
7582                               104
7583                               105
7584                               106   ; trans_code contains the procedures
7585                               107   ; for calculating matrix elements and
7586                               108   ; matrix multiplications
7587                               109
7588 --------                      110   trans_code    segment    er public
7589                               111
7590                               112   ; create mnemonics for fsincos which is not
7591                               113   ; yet available from ASM386 as of now
7592                               114
7593    C MACRO                    115   codemacro fsincos
7594        #                      116   dw 0fbd9h
7595        #                      117   endm
7596                               118
7597 00000000                      119   trans_proc proc far
7598                               120
7599                               121
7600                               122       ; Calculate alpha and theta in radians
7601                               123       ; from their values in degrees
7602                               124
7603 00000000 D9EB                 125         fldpi
7604 00000002 D835B4010000    R    126         fdiv    d180
7605                               127
7606                               128       ; Duplicate pi/180
7607 00000008 D9C0                 129         fld     st
7608                               130
7609 0000000A DC0CCD80010000  R    131         fmul    qword ptr ALPHA_DEG[ecx*8]
7610 00000011 D9C9                 132         fxch    st(1)
7611 00000013 DC0CCD88010000  R    133         fmul    qword ptr THETA_DEG[ecx*8]
7612                               134
7613                               135       ; theta(radians) in ST and
7614                               136       ; alpha(radians) in ST(1)
7615                               137
7616                               138       ; Calculate matrix elements
7617                               139       ; a11 = cos theta
7618                               140       ; a12 = - cos alpha * sin theta
7619                               141       ; a13 = sin alpha * sin theta
7620                               142       ; a14 = A * cos theta
7621                               143       ; a21 = sin theta
7622                               144       ; a22 = cos alpha * cos theta
7623                               145       ; a23 = -sin alpha * cos theta
7624                               146       ; a24 = A * sin theta
7625                               147       ; a32 = sin alpha
7626                               148       ; a33 = cos alpha
7627                               149       ; a34 = D
7628                               150       ; a31 = a41 = a42 = a43 = 0.0
7629                               151       ; a44 =1
7630                               152
7631                               153       ; ebx contains the offset for the matrix
7632                               154
7633 0000001A D9FB                 155        fsincos           ;cos theta in ST
7634                               156                          ;sin theta in ST(1)
7635 0000001C D9C0                 157        fld     st        ;duplicate cos theta
7636 0000001E DD13                 158        fst     [ebx].a11 ;cos theta in a11
7637 00000020 DC0CCD90010000  R    159        fmul    qword ptr A_VECTOR[ecx*8]
7638 00000027 DD5B18               160        fstp    [ebx].a14 ;A * cos theta in a14
7639 0000002A D9C9                 161        fxch    st(1)     ;sin theta in ST
7640 0000002C DD5320               162        fst     [ebx].a21 ;sin theta in a21
7641 0000002F D9C0                 163        fld     st        ;duplicate sin theta
7642 00000031 DC0CCD90010000  R    164        fmul    qword ptr A_VECTOR[ecx*8]
7643 00000038 DD5B38               165        fstp    [ebx].a24 ;A * sin theta in a24
7644 0000003B D9C2                 166        fld     st(2)     ;alpha in ST
7645 0000003D D9FB                 167        fsincos           ;cos alpha in ST
7646                               168                          ;sin alpha in ST(1)
7647                               169                          ;sin theta in ST(2)
7648                               170                          ;cos theta in ST(3)
7649 0000003F DD5350               171        fst     [ebx].a33 ;cos alpha in a33
7650 00000042 D9C9                 172        fxch    st(1)     ;sin alpha in ST
7651 00000044 DD5348               173        fst     [ebx].a32 ;sin alpha in a32
7652 00000047 D9C2                 174        fld     ST(2)     ;sin theta in ST
7653                               175                          ;sin alpha in ST(1)
7654 00000049 D8C9                 176        fmul    st,st(1)  ;sin alpha * sin theta
7655 0000004B DD5B10               177        fstp    [ebx].a13 ;stored in a13
7656 0000004E D8CB                 178        fmul    st,st(3)  ;cos theta * sin alpha
7657 00000050 D9E0                 179        fchs              ;-cos theta * sin alpha
7658 00000052 DD5B30               180        fstp    [ebx].a23 ;stored in a23
7659 00000055 D9C2                 181        fld     st(2)     ;cos theta in ST
7660                               182                          ;cos alpha in ST(1)
7661                               183                          ;sin theta in ST(2)
7662                               184                          ;cos theta in ST(3)
7663 00000057 D8C9                 185        fmul    st,st(1)  ;cos theta * cos alpha
7664 00000059 DD5B28               186        fstp    [ebx].a22 ;stored in a22
7665 0000005C D8C9                 187        fmul    st,st(1)  ;cos alpha * sin theta
7666                               188       ;
7667                               189       ; To take advantage of parallel operations
7668                               190       ; between the CPU and NPX
7669                               191       ;
7670 0000005E 50                   192        push    eax  ; save eax
7671                               193       ;
7672                               194       ; also move D into a34 in a faster way
7673 0000005F 8B04CDA0010000  R    195        mov     eax, dword ptr D_VECTOR[ecx*8]
7674 00000066 894358               196        mov     dword ptr [ebx + 88], eax
7675 00000069 8B04CDA4010000  R    197        mov     eax, dword ptr D_VECTOR[ecx*8 + 4]
7676 00000070 89435C               198        mov     dword ptr [ebx + 92], eax
7677 00000073 58                   199        pop     eax  ; restore eax
7678 00000074 D9E0                 200        fchs              ;-cos alpha * sin theta
7679 00000076 DD5B08               201        fstp    [ebx].a12 ;stored in a12
7680                               202                          ;and all nonzero elements
7681                               203                          ;have been calculated
7682 00000079 CB                   204        ret
7683                               205
7684 0000007A                      206   trans_proc endp
7685                               207
7686                               208
7687 0000007A                      209   matrix_elem proc far
7688                               210
7689                               211       ; This procedure calculates the dot product
7690                               212       ; of the ith row of the first matrix and
7691                               213       ; the jth column of the second matrix:
7692                               214       ;
7693                               215       ; Tij where Tij = sum of Aik x Bkj over k
7694                               216       ;
7695                               217       ; parameters passed from the calling routine,
7696                               218       ; matrix_row:
7697                               219       ; ESI = (i-1)*8
7698                               220       ; EDI = (j-1)*8
7699                               221       ; local register, EBP = (k-1)*8
7700                               222       ;
7701 0000007A 55                   223         push    ebp     ; save ebp
7702 0000007B 51                   224         push    ecx     ; ecx to be used as a tmp reg
7703 0000007C 8BCE                 225         mov     ecx, esi; save it for later indexing
7704                               226
7705                               227       ; locating the element in the first matrix, A
7706 0000007E 6BC904               228         imul    ecx, NUM_COL   ; ecx contains offset due
7707                               229                                ; to preceding rows; the
7708                               230                                ; offset is from the
7709                               231                                ; beginning of the matrix
7710                               232
7711 00000081 31ED                 233         xor     ebp, ebp; clear ebp, which will be
7712                               234                         ; used as a temp reg to index (k)
7713                               235                         ; across the ith row of the first
7714                               236                         ; matrix as well as down the jth
7715                               237                         ; column of the second matrix
7716                               238
7717                               239       ; clear Tij for accumulating Aik*Bkj
7718 00000083 892C39               240         mov     dword ptr [ecx][edi],ebp
7719 00000086 896C3904             241         mov     dword ptr [ecx][edi+4], ebp
7720                               242
7721 0000008A 51                   243         push    ecx     ; save on stack: esi * num_col =
7722                               244                         ; the offset of the beginning
7723                               245                         ; of the ith row from the
7724                               246                         ; beginning of the A matrix
7725                               247
7726 0000008B                      248   NXT_k:
7727 0000008B 01E9                 249         add     ecx, ebp ; get to the kth column entry
7728                               250                          ; of the ith row of the A matrix
7729                               251
7730                               252       ; load Aik into 80387
7731 0000008D DD0408               253         fld     qword ptr [eax][ecx]
7732                               254
7733                               255       ; locating  Bkj
7734 00000090 8BCD                 256         mov     ecx, ebp
7735 00000092 6BC904               257         imul    ecx, NUM_ROW ; ecx contains the offset
7736                               258                              ; of the beginning of the
7737                               259                              ; kth row from the
7738                               260                              ; beginning of the B matrix
7739 00000095 01F9                 261         add    ecx, edi      ; get to the jth column entry
7740                               262                              ; of the kth row of the B
7741                               263                              ; matrix
7742 00000097 DC0C0B               264         fmul   qword ptr [ebx][ecx]; Aik * Bkj
7743 0000009A 59                   265         pop    ecx           ; esi * num_col
7744                               266                              ; in ecx again
7745 0000009B 51                   267         push   ecx           ; also at top of program
7746                               268                              ; stack
7747                               269
7748                               270       ; add to the result in the output matrix, Tij
7749 0000009C 01F9                 271         add    ecx, edi
7750                               272
7751                               273       ; accumulating the sum of Aik * Bkj
7752 0000009E DC040A               274         fadd   qword ptr [edx][ecx]
7753 000000A1 DD1C0A               275         fstp   qword ptr [edx][ecx]
7754                               276       ; increment k by 1, i.e., ebp by 8
7755 000000A4 83C508               277         add    ebp, 8
7756                               278
7757                               279       ; Has k reached the width of the matrix yet?
7758 000000A7 83FD20               280         cmp    ebp, NUM_COL*8
7759 000000AA 7CDF                 281         jl     NXT_k
7760                               282
7761                               283       ; Restore registers
7762 000000AC 59                   284         pop    ecx      ; clear esi*num_col from stack
7763 000000AD 59                   285         pop    ecx      ; restore ecx
7764 000000AE 5D                   286         pop    ebp      ; restore ebp
7765 000000AF CB                   287         ret
7766                               288
7767 000000B0                      289   matrix_elem endp
7768                               290
7769                               291
7770 000000B0                      292   matrix_row proc far
7771                               293
7772 000000B0 31FF                 294         xor    edi, edi
7773                               295       ; scan across a row
7774                               296
7775 000000B2                      297   NXT_COL:
7776 000000B2 9A7A000000----  R    298         call   matrix_elem
7777 000000B9 83C708               299         add    edi, 8
7778 000000BC 83FF20               300         cmp    edi, NUM_COL*8
7779 000000BF 7CF1                 301         jl     NXT_COL
7780 000000C1 CB                   302         ret
7781                               303
7782 000000C2                      304   matrix_row endp
7783                               305
7784                               306
7785 000000C2                      307   matrixmul_proc proc far
7786                               308
7787                               309       ; This procedure does the matrix
7788                               310       ; multiplication by calling matrix_row
7789                               311       ; to calculate entries in each row
7790                               312       ;
7791                               313       ; The matrix multiplication is
7792                               314       ; performed in the following manner,
7793                               315       ;   Tij = Aik x Bkj
7794                               316       ; where i and j denote the row and column
7795                               317       ; respectively and k is the index for
7796                               318       ; scanning across the ith row of the
7797                               319       ; first matrix and the jth column of the
7798                               320       ; second matrix.
7799 000000C2 5A                   321         pop     edx ; offset Tmx in edx
7800 000000C3 5B                   322         pop     ebx ; offset Bmx in ebx
7801 000000C4 58                   323         pop     eax ; offset Amx in eax
7802                               324
7803                               325       ; setup esi and edi
7804                               326       ; edi points to the column
7805                               327       ; esi points to the row
7806                               328
7807 000000C5 31F6                 329         xor     esi, esi ; clear esi
7808                               330
7809 000000C7                      331   NXT_ROW:
7810 000000C7 9AB0000000----  R    332         call    matrix_row
7811 000000CE 83C608               333         add     esi, 8
7812 000000D1 83FE20               334         cmp     esi, NUM_ROW*8
7813 000000D4 7CF1                 335         jl      NXT_ROW
7814 000000D6 CB                   336         ret
7815                               337
7816 000000D7                      338   matrixmul_proc endp
7817                               339
7818                               340
7819 --------                      341   trans_code ends
7820                               342
7821                               343   ;***************************************
7822                               344   ;                                      ;
7823                               345   ;                                      ;
7824                               346   ;                                      ;
7825                               347   ;             Main program             ;
7826                               348   ;                                      ;
7827                               349   ;                                      ;
7828                               350   ;                                      ;
7829                               351   ;***************************************
7830                               352
7831 --------                      353   main_code segment er
7832                               354
7833 00000000                      355   START:
7834                               356
7835 00000000 BC00000000      R    357         mov  esp,  stackstart trans_stack
7836                               358       ; save all registers
7837                               359
7838 00000005 60                   360         pushad
7839                               361
7840                               362       ; ECX denotes the number of joints
7841                               363       ; where no of matrices = NUM_JOINT + 1
7842                               364       ; Find the first matrix (from the base
7843                               365       ; of the system to the first joint)
7844                               366       ; and call it Bmx
7845 00000006 31C9                 367         xor  ecx, ecx          ; 1st matrix
7846 00000008 BB80000000      R    368         mov  ebx, offset Bmx  ;
7847 0000000D 9A00000000----  R    369         call trans_proc        ; is Bmx
7848 00000014 41                   370         inc  ecx
7849                               371
7850 00000015                      372   NXT_MATRIX:
7851                               373       ; From the 2nd matrix and on, it
7852                               374       ; will be stored in Amx.
7853                               375       ; The result from the first matrix mult.
7854                               376       ; is stored in Tmx but will be accessed
7855                               377       ; as Bmx in the next multiplication.
7856                               378       ; As a matter of fact, the roles of Bmx
7857                               379       ; and Tmx alternate in successive
7858                               380       ; multiplications. This is achieved by
7859                               381       ; reversing the order of the Bmx and Tmx
7860                               382       ; pointers being passed onto the program
7861                               383       ; stack. Thus, this is invisible to the
7862                               384       ; matrix multiplication procedure.
7863                               385       ; REVERSE serves as the indicator;
7864                               386       ; REVERSE = 0 means that the result
7865                               387       ;             is to be placed in Tmx.
7866                               388
7867 00000015 BB00000000      R    389         mov     ebx, offset Amx  ;find Amx
7868 0000001A 9A00000000----  R    390         call    trans_proc
7869 00000021 41                   391         inc     ecx
7870 00000022 8035B801000001  R    392         xor     REVERSE, 1h
7871 00000029 7511                 393         jnz     Bmx_as_Tmx
7872                               394
7873                               395       ; no reversing.  Bmx as the second input
7874                               396       ; matrix while Tmx as the output matrix.
7875 0000002B 6800000000      R    397         push    offset Amx
7876 00000030 6880000000      R    398         push    offset Bmx
7877 00000035 6800010000      R    399         push    offset Tmx
7878 0000003A EB0F                 400         jmp     CONTINUE
7879                               401
7880                               402       ; reversing. Tmx as the second input
7881                               403       ; matrix while Bmx as the output matrix.
7882 0000003C                      404   Bmx_as_Tmx:
7883 0000003C 6800000000      R    405         push    offset Amx
7884 00000041 6800010000      R    406         push    offset Tmx  ;reversing the
7885 00000046 6880000000      R    407         push    offset Bmx  ;pointers passed
7886                               408
7887 0000004B                      409   CONTINUE:
7888 0000004B 9AC2000000----  R    410         call    matrixmul_proc
7889 00000052 83F901               411         cmp     ecx, NUM_JOINT
7890 00000055 7EBE                 412         jle     NXT_MATRIX
7891                               413
7892                               414       ; if REVERSE = 1 then the final answer
7893                               415       ; will be in Bmx; otherwise, in Tmx.
7894                               416
7895 00000057 61                   417         popad
7896                               418
7897 --------                      419   main_code  ends
7898                               420
7899                               421   end START, ds:trans_data, ss:trans_stack
7900
7901 ASSEMBLY COMPLETE,   NO WARNINGS,   NO ERRORS.
7902
7903
7904 Appendix A  Machine Instruction Encoding and Decoding
7905
7906 ---------------------------------------------------------------------------
7907
7908 --- 1st Byte ---
7909 Hex  Binary      2nd Byte      Bytes 3-7    ASM386 Instruction Format
7910
7911 D8   1101 1000   MOD 000 R/M   SIB, displ   FADD       single-real
7912 D8   1101 1000   MOD 001 R/M   SIB, displ   FMUL       single-real
7913 D8   1101 1000   MOD 010 R/M   SIB, displ   FCOM       single-real
7914 D8   1101 1000   MOD 011 R/M   SIB, displ   FCOMP      single-real
7915 D8   1101 1000   MOD 100 R/M   SIB, displ   FSUB       single-real
7916 D8   1101 1000   MOD 101 R/M   SIB, displ   FSUBR      single-real
7917 D8   1101 1000   MOD 110 R/M   SIB, displ   FDIV       single-real
7918 D8   1101 1000   MOD 111 R/M   SIB, displ   FDIVR      single-real
7919 D8   1101 1000   1100 0 REG                 FADD       ST,ST(i)
7920 D8   1101 1000   1100 1 REG                 FMUL       ST,ST(i)
7921 D8   1101 1000   1101 0 REG                 FCOM       ST(i)
7922 D8   1101 1000   1101 1 REG                 FCOMP      ST(i)
7923 D8   1101 1000   1110 0 REG                 FSUB       ST,ST(i)
7924 D8   1101 1000   1110 1 REG                 FSUBR      ST,ST(i)
7925 D8   1101 1000   1111 0 REG                 FDIV       ST,ST(i)
7926 D8   1101 1000   1111 1 REG                 FDIVR      ST,ST(i)
7927 D9   1101 1001   MOD 000 R/M   SIB, displ   FLD        single-real
7928 D9   1101 1001   MOD 001 R/M                reserved
7929 D9   1101 1001   MOD 010 R/M   SIB, displ   FST        single-real
7930 D9   1101 1001   MOD 011 R/M   SIB, displ   FSTP       single-real
7931 D9   1101 1001   MOD 100 R/M   SIB, displ   FLDENV     14 or 28 bytes
7932 The size of operand transferred depends on the 80386 operand-size
7933 attribute in effect for the instruction.
7934
7935
7936
7937
7938
7939 D9   1101 1001   MOD 101 R/M   SIB, displ   FLDCW      2 bytes
7940 D9   1101 1001   MOD 110 R/M   SIB, displ   FSTENV     14 or 28 bytes
7941 The size of operand transferred depends on the 80386 operand-size
7942 attribute in effect for the instruction.
7943
7944
7945
7946
7947
7948 D9   1101 1001   MOD 111 R/M   SIB, displ   FSTCW      2 bytes
7949 D9   1101 1001   1100 0 REG                 FLD        ST(i)
7950 D9   1101 1001   1100 1 REG                 FXCH       ST(i)
7951 D9   1101 1001   1101 0000                  FNOP
7952 D9   1101 1001   1101 0001                  reserved
7953 D9   1101 1001   1101 001-                  reserved
7954 D9   1101 1001   1101 01--                  reserved
7955 D9   1101 1001   1101 1 REG                 reserved
7956 D9   1101 1001   1110 0000                  FCHS
7957 D9   1101 1001   1110 0001                  FABS
7958 D9   1101 1001   1110 001-                  reserved
7959 D9   1101 1001   1110 0100                  FTST
7960 D9   1101 1001   1110 0101                  FXAM
7961 D9   1101 1001   1110 011-                  reserved
7962 D9   1101 1001   1110 1000                  FLD1
7963 D9   1101 1001   1110 1001                  FLDL2T
7964 D9   1101 1001   1110 1010                  FLDL2E
7965 D9   1101 1001   1110 1011                  FLDPI
7966 D9   1101 1001   1110 1100                  FLDLG2
7967 D9   1101 1001   1110 1101                  FLDLN2
7968 D9   1101 1001   1110 1110                  FLDZ
7969 D9   1101 1001   1110 1111                  reserved
7970 D9   1101 1001   1111 0000                  F2XM1
7971 D9   1101 1001   1111 0001                  FYL2X
7972 D9   1101 1001   1111 0010                  FPTAN
7973 D9   1101 1001   1111 0011                  FPATAN
7974 D9   1101 1001   1111 0100                  FXTRACT
7975 D9   1101 1001   1111 0101                  FPREM1
7976 D9   1101 1001   1111 0110                  FDECSTP
7977 D9   1101 1001   1111 0111                  FINCSTP
7978 D9   1101 1001   1111 1000                  FPREM
7979 D9   1101 1001   1111 1001                  FYL2XP1
7980 D9   1101 1001   1111 1010                  FSQRT
7981 D9   1101 1001   1111 1011                  FSINCOS
7982 D9   1101 1001   1111 1100                  FRNDINT
7983 D9   1101 1001   1111 1101                  FSCALE
7984 D9   1101 1001   1111 1110                  FSIN
7985 D9   1101 1001   1111 1111                  FCOS
7986 DA   1101 1010   MOD 000 R/M   SIB, displ   FIADD      short-integer
7987 DA   1101 1010   MOD 001 R/M   SIB, displ   FIMUL      short-integer
7988 DA   1101 1010   MOD 010 R/M   SIB, displ   FICOM      short-integer
7989 DA   1101 1010   MOD 011 R/M   SIB, displ   FICOMP     short-integer
7990 DA   1101 1010   MOD 100 R/M   SIB, displ   FISUB      short-integer
7991 DA   1101 1010   MOD 101 R/M   SIB, displ   FISUBR     short-integer
7992 DA   1101 1010   MOD 110 R/M   SIB, displ   FIDIV      short-integer
7993 DA   1101 1010   MOD 111 R/M   SIB, displ   FIDIVR     short-integer
7994 DA   1101 1010   110- ----                  reserved
7995 DA   1101 1010   1110 0---                  reserved
7996 DA   1101 1010   1110 1000                  reserved
7997 DA   1101 1010   1110 1001                  FUCOMPP
7998 DA   1101 1010   1110 101-                  reserved
7999 DA   1101 1010   1110 11--                  reserved
8000 DA   1101 1010   1111 ----                  reserved
8001 DB   1101 1011   MOD 000 R/M   SIB, displ   FILD       short-integer
8002 DB   1101 1011   MOD 001 R/M   SIB, displ   reserved
8003 DB   1101 1011   MOD 010 R/M   SIB, displ   FIST       short-integer
8004 DB   1101 1011   MOD 011 R/M   SIB, displ   FISTP      short-integer
8005 DB   1101 1011   MOD 100 R/M   SIB, displ   reserved
8006 DB   1101 1011   MOD 101 R/M   SIB, displ   FLD        extended-real
8007 DB   1101 1011   MOD 110 R/M   SIB, displ   reserved
8008 DB   1101 1011   MOD 111 R/M   SIB, displ   FSTP       extended-real
8009 DB   1101 1011   110- ----                  reserved
8010 DB   1101 1011   1110 0000
8011 This encoding can be generated by the language translators;
8012 however, the 80387 treats it as FNOP. It corresponds to the following
8013 8087 or 80287 instructions: FENI.
8014
8015
8016
8017
8018
8019 DB   1101 1011   1110 0001
8020 This encoding can be generated by the language translators;
8021 however, the 80387 treats it as FNOP. It corresponds to the following
8022 8087 or 80287 instructions: FDISI.
8023
8024
8025
8026
8027
8028 DB   1101 1011   1110 0010                  FCLEX
8029 DB   1101 1011   1110 0011                  FINIT
8030 DB   1101 1011   1110 0100
8031 This encoding can be generated by the language translators;
8032 however, the 80387 treats it as FNOP. It corresponds to the following
8033 8087 or 80287 instructions: FSETPM.
8034
8035
8036
8037
8038
8039 DB   1101 1011   1110 0101                  reserved
8040 DB   1101 1011   1110 011-                  reserved
8041 DB   1101 1011   1110 1---                  reserved
8042 DB   1101 1011   1111 ----                  reserved
8043 DC   1101 1100   MOD 000 R/M   SIB, displ   FADD       double-real
8044 DC   1101 1100   MOD 001 R/M   SIB, displ   FMUL       double-real
8045 DC   1101 1100   MOD 010 R/M   SIB, displ   FCOM       double-real
8046 DC   1101 1100   MOD 011 R/M   SIB, displ   FCOMP      double-real
8047 DC   1101 1100   MOD 100 R/M   SIB, displ   FSUB       double-real
8048 DC   1101 1100   MOD 101 R/M   SIB, displ   FSUBR      double-real
8049 DC   1101 1100   MOD 110 R/M   SIB, displ   FDIV       double-real
8050 DC   1101 1100   MOD 111 R/M   SIB, displ   FDIVR      double-real
8051 DC   1101 1100   1100 0 REG                 FADD       ST(i),ST
8052 DC   1101 1100   1100 1 REG                 FMUL       ST(i),ST
8053 DC   1101 1100   1101 0 REG                 reserved
8054 DC   1101 1100   1101 1 REG                 reserved
8055 DC   1101 1100   1110 0 REG                 FSUBR      ST(i),ST
8056 DC   1101 1100   1110 1 REG                 FSUB       ST(i),ST
8057 DC   1101 1100   1111 0 REG                 FDIVR      ST(i),ST
8058 DC   1101 1100   1111 1 REG                 FDIV       ST(i),ST
8059 DD   1101 1101   MOD 000 R/M   SIB, displ   FLD        double-real
8060 DD   1101 1101   MOD 001 R/M                reserved
8061 DD   1101 1101   MOD 010 R/M   SIB, displ   FST        double-real
8062 DD   1101 1101   MOD 011 R/M   SIB, displ   FSTP       double-real
8063 DD   1101 1101   MOD 100 R/M   SIB, displ   FRSTOR     94 or 108 bytes
8064 The size of operand transferred depends on the 80386 operand-size
8065 attribute in effect for the instruction.
8066
8067
8068
8069
8070
8071 DD   1101 1101   MOD 101 R/M   SIB, displ   reserved
8072 DD   1101 1101   MOD 110 R/M   SIB, displ   FSAVE      94 or 108 bytes
8073 The size of operand transferred depends on the 80386 operand-size
8074 attribute in effect for the instruction.
8075
8076
8077
8078
8079
8080 DD   1101 1101   MOD 111 R/M   SIB, displ   FSTSW      2 bytes
8081 DD   1101 1101   1100 0 REG                 FFREE      ST(i)
8082 DD   1101 1101   1100 1 REG                 reserved
8083 DD   1101 1101   1101 0 REG                 FST        ST(i)
8084 DD   1101 1101   1101 1 REG                 FSTP       ST(i)
8085 DD   1101 1101   1110 0 REG                 FUCOM      ST(i)
8086 DD   1101 1101   1110 1 REG                 FUCOMP     ST(i)
8087 DD   1101 1101   1111 ----                  reserved
8088 DE   1101 1110   MOD 000 R/M   SIB, displ   FIADD      word-integer
8089 DE   1101 1110   MOD 001 R/M   SIB, displ   FIMUL      word-integer
8090 DE   1101 1110   MOD 010 R/M   SIB, displ   FICOM      word-integer
8091 DE   1101 1110   MOD 011 R/M   SIB, displ   FICOMP     word-integer
8092 DE   1101 1110   MOD 100 R/M   SIB, displ   FISUB      word-integer
8093 DE   1101 1110   MOD 101 R/M   SIB, displ   FISUBR     word-integer
8094 DE   1101 1110   MOD 110 R/M   SIB, displ   FIDIV      word-integer
8095 DE   1101 1110   MOD 111 R/M   SIB, displ   FIDIVR     word-integer
8096 DE   1101 1110   1100 0 REG                 FADDP      ST(i),ST
8097 DE   1101 1110   1100 1 REG                 FMULP      ST(i),ST
8098 DE   1101 1110   1101 0---                  reserved
8099 DE   1101 1110   1101 1000                  reserved
8100 DE   1101 1110   1101 1001                  FCOMPP
8101 DE   1101 1110   1101 101-                  reserved
8102 DE   1101 1110   1101 11--                  reserved
8103 DE   1101 1110   1110 0 REG                 FSUBRP     ST(i),ST
8104 DE   1101 1110   1110 1 REG                 FSUBP      ST(i),ST
8105 DE   1101 1110   1111 0 REG                 FDIVRP     ST(i),ST
8106 DE   1101 1110   1111 1 REG                 FDIVP      ST(i),ST
8107 DF   1101 1111   MOD 000 R/M   SIB, displ   FILD       word-integer
8108 DF   1101 1111   MOD 001 R/M   SIB, displ   reserved
8109 DF   1101 1111   MOD 010 R/M   SIB, displ   FIST       word-integer
8110 DF   1101 1111   MOD 011 R/M   SIB, displ   FISTP      word-integer
8111 DF   1101 1111   MOD 100 R/M   SIB, displ   FBLD       packed-decimal
8112 DF   1101 1111   MOD 101 R/M   SIB, displ   FILD       long-integer
8113 DF   1101 1111   MOD 110 R/M   SIB, displ   FBSTP      packed-decimal
8114 DF   1101 1111   MOD 111 R/M   SIB, displ   FISTP      long-integer
8115 DF   1101 1111   1100 0 REG                 reserved
8116 DF   1101 1111   1100 1 REG                 reserved
8117 DF   1101 1111   1101 0 REG                 reserved
8118 DF   1101 1111   1101 1 REG                 reserved
8119 DF   1101 1111   1110 0000                  FSTSW AX
8120 DF   1101 1111   1110 0001                  reserved
8121 DF   1101 1111   1110 001-                  reserved
8122 DF   1101 1111   1110 01--                  reserved
8123 DF   1101 1111   1110 1---                  reserved
8124 DF   1101 1111   1111 ----                  reserved
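
The table can be read as a three-way lookup: a fully specified second byte,
a register form (MOD = 11, with the low three bits naming ST(i)), or a memory
form selected by the middle three bits. The following Python sketch shows that
split for a handful of rows only. The dictionary and function names are
hypothetical and are not part of ASM386; the rows are transcribed from the
table above.

```python
# Sketch of decoding the first two bytes of an 80387 (ESC) instruction.
# Only a few rows of the Appendix A table are transcribed here.
# All names below are illustrative assumptions, not ASM386 identifiers.

# Second bytes that are fully specified in the table.
FIXED_FORMS = {
    (0xD9, 0xE0): "FCHS",
    (0xD9, 0xFA): "FSQRT",
    (0xDA, 0xE9): "FUCOMPP",
    (0xDE, 0xD9): "FCOMPP",
    (0xDF, 0xE0): "FSTSW AX",
}

# Register forms (MOD = 11), keyed by (first byte, middle three bits).
REGISTER_FORMS = {
    (0xD8, 0): "FADD ST,ST({i})",
    (0xD9, 0): "FLD ST({i})",
    (0xD9, 1): "FXCH ST({i})",
    (0xDC, 0): "FADD ST({i}),ST",
    (0xDD, 0): "FFREE ST({i})",
}

# Memory forms (MOD != 11), keyed by (first byte, middle three bits).
MEMORY_FORMS = {
    (0xD8, 0): "FADD single-real",
    (0xD8, 1): "FMUL single-real",
    (0xD9, 0): "FLD single-real",
    (0xD9, 5): "FLDCW 2 bytes",
    (0xDD, 7): "FSTSW 2 bytes",
}


def split_second_byte(byte2):
    """Return the (MOD, middle, R/M) fields of the second byte."""
    mod = (byte2 >> 6) & 0b11
    middle = (byte2 >> 3) & 0b111
    rm = byte2 & 0b111
    return mod, middle, rm


def decode_esc(byte1, byte2):
    """Look up a mnemonic for the two opcode bytes, if this sketch knows it."""
    if not 0xD8 <= byte1 <= 0xDF:
        raise ValueError("first byte is not in the range D8-DF")
    if (byte1, byte2) in FIXED_FORMS:
        return FIXED_FORMS[(byte1, byte2)]
    mod, middle, rm = split_second_byte(byte2)
    if mod == 0b11:
        template = REGISTER_FORMS.get((byte1, middle))
        return template.format(i=rm) if template else "not in this sketch"
    return MEMORY_FORMS.get((byte1, middle), "not in this sketch")
```

For example, decode_esc(0xD8, 0xC1) selects the row "1100 0 REG" with
REG = 001. Bytes 3-7 (SIB and displacement) are not examined by this sketch.
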
8125
8126
8127 Appendix B  Exception Summary
8128
8129 ---------------------------------------------------------------------------
8130
8131 The following table lists the instruction mnemonics in alphabetical order.
8132 For each mnemonic, it summarizes the exceptions that the instruction may
8133 cause. When writing 80387 programs that may be used in an environment that
8134 employs numerics exception handlers, assembly-language programmers should be
8135 aware of the possible exceptions for each instruction in order to determine
8136 the need for exception synchronization. Chapter 4 explains the need for
8137 exception synchronization.
8138
8139
8140 Column key:
8141   IS  Invalid operand due to stack overflow/underflow
8142   I   Invalid operand due to other cause
8143   D   Denormal operand
8144   Z   Zero-divide
8145   O   Overflow
8146   U   Underflow
8147   P   Inexact result (precision)
8148
8149 Mnemonic            Instruction              IS  I   D   Z   O   U   P
8150
8154 F2XM1               2^(X) - 1                Y   Y   Y           Y   Y
8155 FABS                Absolute value           Y
8156 FADD(P)             Add real                 Y   Y   Y       Y   Y   Y
8157 FBLD                BCD load                 Y
8158 FBSTP               BCD store and pop        Y   Y                   Y
8159 FCHS                Change sign              Y
8160 FCLEX               Clear exceptions
8161 FCOM(P)(P)          Compare real             Y   Y   Y
8162 FCOS                Cosine                   Y   Y   Y           Y   Y
8163 FDECSTP             Decrement stack pointer
8164 FDIV(R)(P)          Divide real              Y   Y   Y   Y   Y   Y   Y
8165 FFREE               Free register
8166 FIADD               Integer add              Y   Y   Y       Y   Y   Y
8167 FICOM(P)            Integer compare          Y   Y   Y
8168 FIDIV               Integer divide           Y   Y   Y   Y       Y   Y
8169 FIDIVR              Integer divide reversed  Y   Y   Y   Y   Y   Y   Y
8170 FILD                Integer load             Y
8171 FIMUL               Integer multiply         Y   Y   Y       Y   Y   Y
8172 FINCSTP             Increment stack pointer
8173 FINIT               Initialize processor
8174 FIST(P)             Integer store            Y   Y                   Y
8175 FISUB(R)            Integer subtract         Y   Y   Y       Y   Y   Y
8176 FLD extended
8177 or stack            Load real                Y
8178 FLD single
8179 or double           Load real                Y   Y   Y
8180 FLD1                Load + 1.0               Y
8181 FLDCW               Load Control word        Y   Y   Y   Y   Y   Y   Y
8182 FLDENV              Load environment         Y   Y   Y   Y   Y   Y   Y
8183 FLDL2E              Load log{2}e             Y
8184 FLDL2T              Load log{2}10            Y
8185 FLDLG2              Load log{10}2            Y
8186 FLDLN2              Load log{e}2             Y
8187 FLDPI               Load pi                  Y
8188 FLDZ                Load + 0.0               Y
8189 FMUL(P)             Multiply real            Y   Y   Y       Y   Y   Y
8190 FNOP                No operation
8191 FPATAN              Partial arctangent       Y   Y   Y           Y   Y
8192 FPREM               Partial remainder        Y   Y   Y           Y
8193 FPREM1              IEEE partial remainder   Y   Y   Y           Y
8194 FPTAN               Partial tangent          Y   Y   Y           Y   Y
8195 FRNDINT             Round to integer         Y   Y   Y               Y
8196 FRSTOR              Restore state            Y   Y   Y   Y   Y   Y   Y
8197 FSAVE               Save state
8198 FSCALE              Scale                    Y   Y   Y       Y   Y   Y
8199 FSIN                Sine                     Y   Y   Y           Y   Y
8200 FSINCOS             Sine and cosine          Y   Y   Y           Y   Y
8201 FSQRT               Square root              Y   Y   Y               Y
8202 FST(P) stack
8203 or extended         Store real               Y
8204 FST(P) single
8205 or double           Store real               Y   Y   Y       Y   Y   Y
8206 FSTCW               Store control word
8207 FSTENV              Store Environment
8208 FSTSW (AX)          Store status word
8209 FSUB(R)(P)          Subtract real            Y   Y   Y       Y   Y   Y
8210 FTST                Test                     Y   Y   Y
8211 FUCOM(P)(P)         Unordered compare real   Y   Y   Y
8212 FWAIT               CPU Wait
8213 FXAM                Examine
8214 FXCH                Exchange registers       Y
8215 FXTRACT             Extract                  Y   Y   Y   Y
8216 FYL2X               Y * log{2}X              Y   Y   Y   Y   Y   Y   Y
8217 FYL2XP1             Y * log{2}(X + 1)        Y   Y   Y           Y   Y
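
When deciding where exception synchronization may be needed, it can help to
hold the table in a lookup structure. The following Python sketch transcribes
only a few rows; the names COLUMNS, EXCEPTIONS, possible_exceptions, and
can_fault are hypothetical. Whether a given instruction actually requires
synchronization is a separate question that Chapter 4 addresses.

```python
# Sketch of the Appendix B table as a lookup, for a few mnemonics only.
# All names here are illustrative assumptions.

# Exception columns in the order used by the table.
COLUMNS = ("IS", "I", "D", "Z", "O", "U", "P")

# Rows transcribed from the table; an empty set means no column is marked.
EXCEPTIONS = {
    "FSQRT": {"IS", "I", "D", "P"},
    "FDIV": set(COLUMNS),
    "FTST": {"IS", "I", "D"},
    "FLD1": {"IS"},
    "FNOP": set(),
    "FSTSW": set(),
}


def possible_exceptions(mnemonic):
    """Return the marked exceptions for a mnemonic, in table column order."""
    marked = EXCEPTIONS[mnemonic]
    return [name for name in COLUMNS if name in marked]


def can_fault(mnemonic):
    """True if the table marks at least one exception for the mnemonic."""
    return bool(EXCEPTIONS[mnemonic])
```
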
8218
8219
8220 Appendix C  Compatibility Between the 80387 and the 80287/8087
8221
8222 ---------------------------------------------------------------------------
8223
8224 This appendix summarizes the differences between the 80387 and its
8225 predecessors, the 80287 and the 8087, and analyzes the impact of these
8226 differences on software that must be transported from the 80287 or 8087 to
8227 the 80387. Any migration from the 8087 directly to the 80387 must also take
8228 into account the additional differences between the 8087 and the 80387 as
8229 listed in Appendix D of this manual.
8230
8231
8232                 +--------------------Difference Description------------------+
8233 Issue            80387 Behavior                  8087/80287 Behavior          Impact on Software              Reason for the Difference
8234
8235 C.1  INITIALIZATION SEQUENCE
8236
8237 RESET,           After a hardware RESET,         No difference between        80387 initialization            Permits the 80386 to
8238 FINIT,           the ERROR# output is            RESET and FINIT.             software must execute an        differentiate between the 80287
8239 and              asserted to indicate that an                                 FNINIT instruction to clear     and the 80387.
8240 ERROR#           80387 is present. To                                         ERROR#. The FNINIT is
8241 PIN              accomplish this, the IE and                                  not required for 80287/8087
8242                  ES bits of the status word                                   software, though Intel
8243                  are set, and the IM bit in                                   documentation
8244                  the control word is reset.                                   recommends its use (refer to
8245                  After FINIT, the status                                      the Numerics Supplement to
8246                  word and the control word                                    the iAPX 286 Programmer's
8247                  have the same values as in                                   Reference Manual.)
8248                  an 80287/8087 after
8249                  RESET.
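
The post-RESET state described above can be expressed as a bit test on saved
copies of the status and control words. The following Python sketch assumes
the usual bit positions (IE is bit 0 and ES is bit 7 of the status word; IM is
bit 0 of the control word). The function name is hypothetical, and the word
values in the usage note are illustrative rather than taken from this table.

```python
# Sketch: recognize the 80387 post-RESET state from saved status/control words.
# Bit positions are assumptions stated in the lead-in; names are illustrative.

IE = 1 << 0  # status word: invalid-operation exception flag
ES = 1 << 7  # status word: error summary
IM = 1 << 0  # control word: invalid-operation mask


def looks_like_387_after_reset(status_word, control_word):
    """True if IE and ES are set and IM is clear, as described for RESET."""
    ie_and_es_set = (status_word & IE) != 0 and (status_word & ES) != 0
    im_clear = (control_word & IM) == 0
    return ie_and_es_set and im_clear
```

For instance, a status word of 0081H with a control word of 037EH satisfies
the test, while the values loaded by FINIT (status word 0000H, control word
037FH) do not.
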
8250
8251 C.2  DATA TYPES AND EXCEPTION HANDLING
8252
8253 NaN              The 80387 distinguishes         The 80287/8087 only          Uninitialized memory            IEEE Standard 754
8254                  between signaling NaNs          generates one kind of NaN    locations that contain          compatibility.
8255                  and quiet NaNs. The 80387       (the equivalent of a quiet   QNaNs should be changed
8256                  only generates quiet NaNs.      NaN) but raises an           to SNaNs to cause the
8257                  An invalid-operation            invalid-operation exception  80387 to fault when
8258                  exception is raised only        upon encountering any kind   uninitialized memory locations
8259                  upon encountering a             of NaN.                      are referenced.
8260                  signaling NaN (except for
8261                  FCOM, FIST, and FBSTP
8262                  which also raise IE for
8263                  quiet NaNs).
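
To prepare uninitialized memory as suggested above, software needs to tell a
quiet NaN from a signaling NaN by bit pattern. The following Python sketch
does this for the double-real format, assuming the IEEE Standard 754
convention that the most significant fraction bit is 1 for a quiet NaN and 0
for a signaling NaN. The function names are hypothetical.

```python
# Sketch: classify a 64-bit double-real bit pattern as QNaN, SNaN, or neither.
# Assumes the most significant fraction bit is the quiet bit.
import struct


def classify_nan(bits64):
    """Return 'QNaN', 'SNaN', or 'not NaN' for a 64-bit pattern."""
    exponent = (bits64 >> 52) & 0x7FF
    fraction = bits64 & ((1 << 52) - 1)
    # A NaN has an all-ones exponent and a nonzero fraction.
    if exponent != 0x7FF or fraction == 0:
        return "not NaN"
    quiet_bit = (fraction >> 51) & 1
    return "QNaN" if quiet_bit else "SNaN"


def double_to_bits(value):
    """Return the 64-bit pattern of a Python float (a double-real value)."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]
```

A fill pattern such as 7FF0000000000001H classifies as an SNaN under this
test, so it would be a candidate for marking uninitialized double-real
locations.
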
8264
8265 Pseudozero,      The 80387 neither               The 80287/8087 defines       None. The 80387 does not        IEEE Standard 754
8266 Pseudo-NaN,      generates nor supports these    and supports special         generate these formats,         compatibility.
8267 Pseudoinfinity,  formats; it raises an           handling for these formats.  and therefore will not
8268 and Unnormal     invalid-operation exception                                  encounter them unless a
8269 Formats          whenever it encounters                                       programmer deliberately
8270                  them in an arithmetic                                        enters them.
8271                  operation.
8272
8273 Tag Word         The encoding in the tag         The encoding for pseudo-     The exception handler may       IEEE Standard 754
8274 Bits for         word for the unsupported        zero and unnormal is         need to be changed if           compatibility.
8275 Unsupported      data formats mentioned in       "valid" (type 00); the       programmers use such
8276 Data             Section C.2.2 is "special       others are "special data"    data types.
8277 Formats          data" (type 10).                (type 10).
8278
8279 Invalid-         No invalid-operation            Upon encountering a          None. Software on the           Upgrade, to eliminate
8280 Operation        exception is raised upon        denormal in FSQRT, FDIV,     80387 will continue to          exception.
8281 Exception        encountering a denormal in      or FPREM or upon             execute in cases where the
8282                  FSQRT, FDIV, or FPREM           conversion to BCD or to      80287/8087 would trap.
8283                  or upon conversion to           integer, the invalid-
8284                  BCD or to integer. The          operation exception is
8285                  operation proceeds by first     raised.
8286                  normalizing the value.
8287
8288 Denormal         The denormal exception is       The denormal exception is    The exception handler           Performance enhancement
8289 Exception        raised in transcendental        not raised in transcendental needs to be changed only        for normal case.
8290                  instructions and FXTRACT.       instructions and FXTRACT.    if it gives special treatment
8291                                                                               to different opcodes.
8292
8293 Overflow         Overflow exception              Overflow exception           Overflow exception              IEEE Standard 754
8294 Exception        masked.                         masked.                      masked.                         compatibility.
8295                  If the rounding mode is set     The 80287/8087 does not      Under the most common
8296                  to chop (toward zero), the      signal the overflow          rounding modes, no
8297                  result is the most positive     exception when the masked    impact. If rounding is
8298                  or most negative number.        response is not infinity;    toward zero (chop), a
8299                                                  i.e., it signals overflow    program on the 80387
8300                                                  only when the rounding       produces under overflow
8301                                                  control is not set to round  conditions a result that is
8302                                                  to zero. If rounding is set  different in the least
8303                                                  to chop (toward zero), the   significant bit of the
8304                                                  result is positive or        significand, compared to
8305                                                  negative infinity.           the result on the 80287.
8306
8307                  Overflow exception not          Overflow exception not       Overflow exception not
8308                  masked.                         masked.                      masked.
8309                  The precision exception is      The precision exception is   If the result is stored on
8310                  flagged. When the result is     not flagged and the          the stack, a program on
8311                  stored in the stack, the        significand is not rounded.  the 80387 produces a
8312                  significand is rounded                                       different result under
8313                  according to the precision                                   overflow conditions than
8314                  control (PC) bit of the                                      on the 80287/8087. The
8315                  control word or according                                    difference is apparent only
8316                  to the opcode.                                               to the exception handler.
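
The masked overflow response on the 80387 can be summarized as a function of
the rounding mode and the sign of the result. The following Python sketch
covers the chop case described above and, as an assumption based on IEEE
Standard 754, the other three rounding modes. The function name and the mode
strings are hypothetical, and the double-real maximum stands in for "the most
positive number" of whatever destination format applies.

```python
# Sketch: masked overflow result per rounding mode, following IEEE 754.
# Mode names and the function name are illustrative assumptions.
import math
import sys


def masked_overflow_result(rounding, negative, largest=sys.float_info.max):
    """Return the value delivered when overflow occurs and is masked."""
    if rounding == "nearest":
        return -math.inf if negative else math.inf
    if rounding == "chop":
        # Toward zero: the largest finite magnitude, never infinity.
        return -largest if negative else largest
    if rounding == "up":
        # Toward +infinity: only positive results reach infinity.
        return -largest if negative else math.inf
    if rounding == "down":
        # Toward -infinity: only negative results reach infinity.
        return -math.inf if negative else largest
    raise ValueError("unknown rounding mode: " + rounding)
```
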
8317
8318 Underflow        Conditions for underflow.       Conditions for underflow.    Underflow exception             IEEE Standard 754
8319 Exception        When the underflow              When the underflow           masked.                         compatibility.
8320                  exception is masked, the        exception is masked and      No impact. The underflow
8321 Two related      underflow exception is          rounding is toward zero, the exception occurs less
8322 events           signaled when both the          underflow exception flag is  often when rounding is
8323 contribute to    result is tiny and              raised on tininess,          toward zero.
8324 underflow:       denormalization results         regardless of loss of
8325                  in a loss of accuracy.          accuracy.
8326 1. Creation of a
8327    tiny result.  Response to underflow.          Response to underflow.       Underflow exception not
8328    A tiny        When the underflow              When the underflow           masked.
8329    number,       exception is unmasked           exception is not masked and  A program on the 80387
8330    because it    and the instruction is          the destination is the       produces a different result
8331    is so small,  supposed to store the           stack, the significand is    during underflow
8332    may cause     result on the stack, the        not rounded but rather is    conditions than on the 80287/
8333    some other    significand is rounded to       left as is.                  8087 if the result is
8334    exception     the appropriate precision                                    stored on the stack. The
8335    later (such   (according to the precision                                  difference is only in the
8336    as overflow   control (PC) bit of the                                      least significant bit of the
8337    upon          control word, for those                                      significand and is apparent
8338    division).    instructions controlled by                                   only to the exception handler.
8339                  PC, otherwise to extended
8340  2. Loss of      precision).
8341    accuracy
8342    during the
8343    denormaliza-
8344    tion of a
8345    tiny number.
8356
8357 Which of these
8358 events triggers
8359 the underflow
8360 exception
8361 depends on
8362 whether the
8363 underflow
8364 exception is
8365 masked.
8366
8367 Exception       There is no difference in       When the denormal             None, but some unneeded         Operational improvement.
8368 Precedence      the precedence of the           exception is not masked,      normalization of denormal
8369                 denormal exception,             it takes precedence over      operands is prevented on
8370                 whether it be masked or         all other exceptions.         the 80387.
8371                 not.
8372
8373 C.3  TAG, STATUS, AND CONTROL WORDS
8374
8375 Bits C3-C0 of   After FINIT, incomplete         After FINIT, incomplete       None.                           Upgrade, to provide
8376 Status Word     FPREM, and hardware             FPREM, and hardware                                           consistent state after reset.
8377                 reset, the 80387 sets these     reset, the 80287/8087
8378                 bits to zero.                   leaves these bits intact
8379                                                 (they contain the prior
8380                                                 value).
8381
8382 Bit C2 of       Bit 10 (C2) serves as an        This bit is undefined for     None. Programs don't            Upgrade to allow fast
8383 Status Word     incomplete bit for FPTAN.       FPTAN.                        check C2 after FPTAN.           checking of operand range.
8384
8385
8386 Infinity        Only affine closure is          Both affine and projective    Software that requires          IEEE Standard 754
8387 Control         supported. Bit 12 remains       closures are supported.       projective infinity             compatibility.
8388                 programmable but has no         After RESET, the default      arithmetic may give
8389                 effect on 80387 operation.      value in the control word is  different results.
8390                                                 projective.
8391
8392 Status Word     When an invalid-operation       When an invalid-operation     None. Existing exception        Upgrade and performance
8393 Bit 6 for       exception occurs due to         exception occurs due to       handlers need not change,       improvement.
8394 Stack Fault     stack overflow or               stack overflow or underflow,  but may be upgraded to
8395                 underflow, not only is bit 0    only bit 0 (IE) of the        take advantage of the
8396                 (IE) of the status word set,    status word is set. Bit 6 is  additional information.
8397                 but also bit 6 is set to        RESERVED.                     Newly written handlers will
8398                 indicate a stack fault and                                    be more effective.
8399                 bit 9 (C1) specifies overflow
8400                 or underflow. Bit 6 is called
8401                 SF and serves to distinguish
8402                 invalid exceptions caused by
8403                 stack overflow/underflow from
8404                 those caused by numeric
8405                 operations.
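
An exception handler can use the additional information roughly as follows.
The Python sketch below tests bits 0 (IE), 6 (SF), and 9 (C1) of a saved
status word. The polarity of C1 (1 for overflow, 0 for underflow) is an
assumption not stated in this table, and the function name is hypothetical.

```python
# Sketch: classify an invalid-operation exception from a saved status word.
# Assumes C1 = 1 means stack overflow and C1 = 0 means stack underflow.

IE = 1 << 0  # invalid-operation exception flag
SF = 1 << 6  # stack fault
C1 = 1 << 9  # condition code bit 1


def classify_invalid(status_word):
    """Describe the cause of an invalid-operation exception, if any."""
    if not status_word & IE:
        return "no invalid-operation exception"
    if not status_word & SF:
        return "invalid numeric operation"
    return "stack overflow" if status_word & C1 else "stack underflow"
```

On an 80287 or 8087, bit 6 is reserved, so a handler shared between the
processors cannot rely on this test there.
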
8406
8407 Tag Word        When loading the tag word       The corresponding tag is      Software may not operate        Performance improvement.
8408                 with an FLDENV or               checked before each           correctly if it uses FLDENV
8409                 FRSTOR instruction, the         register access to determine  or FRSTOR to change tags
8410                 only interpretations of tag     the class of operand in the   to values (other than
8411                 values used by the 80387        register; the tag is updated  empty) that are different
8412                 are empty (value 11) and        after every change to a       from actual register
8413                 nonempty (values 00, 01,        register so that the tag      contents.
8414                 and 10). Subsequent             always reflects the most
8415                 operations on a nonempty        recent status of the
8416                 register always examine         register. Programmers can
8417                 the value in the register,      load a tag with a value that
8418                 not the value in its tag.       disagrees with the contents
8419                 The FSTENV and FSAVE            of a register (for example,
8420                 instructions examine the        the register contains valid
8421                 nonempty registers and          contents, but the tag says
8422                 put the correct values in       special; the 80287/8087, in
8423                 the tags before storing the     this case, honors the tag
8424                 tag word.                       and does not examine the
8425                                                 register).
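
The empty/nonempty reading of a loaded tag word can be sketched as follows.
The Python function below assumes the usual layout in which the tag for
physical register n occupies bits 2n+1 and 2n of the tag word; the function
name is hypothetical.

```python
# Sketch: how the 80387 interprets a tag word loaded by FLDENV or FRSTOR.
# Only "empty" (11) versus "nonempty" (00, 01, 10) matters on load.

EMPTY = 0b11


def nonempty_registers(tag_word):
    """Return the physical register numbers whose tag is not 11 (empty)."""
    return [n for n in range(8) if (tag_word >> (2 * n)) & 0b11 != EMPTY]
```

Tags 00, 01, and 10 are treated alike here, which mirrors the statement that
the 80387 examines the register contents rather than the tag for any nonempty
register.
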
8426
8427 C.4  INSTRUCTION SET
8428
8429 FBSTP, FDIV,    Operation on denormal           Operation on denormal         The exception handler for       IEEE Standard 754
8430 FIST(P), FPREM, operand is supported. An        operand raises                underflow may require           compatibility.
8431 FSQRT           underflow exception can         invalid-operation exception.  change only if it gives
8432                 occur.                          Underflow is not possible.    different treatment to
8433                                                                               different opcodes.  Possibly
8434                                                                               fewer invalid-operation
8435                                                                               exceptions will occur.
8436
8437 FSCALE          The range of the scaling        The range of the scaling      Different result when           Upgrade.
8438                 operand is not restricted.      operand is restricted. If     0 < |ST(1)| < 1.
8439                 If 0 < |ST(1)| < 1, the         0 < |ST(1)| < 1, the result
8440                 scaling factor is zero;         is undefined and no
8441                 therefore, ST(0) remains        exception is signaled.
8442                 unchanged. If the rounded
8443                 result is not exact or if
8444                 there was a loss of
8445                 accuracy (masked underflow),
8446                 the precision exception
8447                 is signaled.
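
The 80387 behavior for fractional scale factors can be modeled by chopping
ST(1) toward zero before scaling. The following Python sketch assumes that
model and uses double precision rather than the extended format, so it
illustrates only the choice of scale factor, not rounding or exception
signaling. The function name is hypothetical.

```python
# Sketch: FSCALE on the 80387 modeled as ST(0) * 2**chop(ST(1)).
# Double precision stands in for extended precision; no exceptions modeled.
import math


def fscale(st0, st1):
    """Scale st0 by two raised to st1 truncated toward zero."""
    return math.ldexp(st0, math.trunc(st1))
```

With 0 < |ST(1)| < 1 the truncated scale factor is zero, so fscale(3.0, 0.75)
and fscale(3.0, -0.75) both leave the first operand unchanged.
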
8448
8449 FPREM1          Performs partial remainder      Does not exist.               None.                           IEEE Standard 754
8450                 according to IEEE                                                                             compatibility and upgrade.
8451                 Standard 754.
8452
8453 FPREM           Bits C0, C3, C1 of the          The quotient bits are         None. Software that works       Upgrade.
8454                 status word correctly           incorrect when performing a   around the bug should not
8455                 reflect the three low-order     reduction of 64^(N) + M when  be affected.
8456                 bits of the quotient.           N ≥ 1 and M=1 or M=2.
8457
8458
8459 FUCOM, FUCOMP,  Perform unordered               Do not exist.                 None.                           IEEE Standard 754
8460 FUCOMPP         compare according to                                                                          compatibility.
8461                 IEEE Standard 754.
8462
8463
8464 FPTAN           Range of operand is much        Range of operand is           None.                           Upgrade.
8465                 less restricted (|ST(0)| <      restricted (|ST(0)| < π/4);
8466                 2^(63)); reduces operand        operand must be reduced
8467                 internally using an internal    to range using FPREM.
8468                 π/4 constant that is more
8469                 accurate.
8470
8471                 After a stack overflow          After a stack overflow                                        IEEE Standard 754
8472                 when the invalid-operation      when the invalid-operation                                    compatibility.
8473                 exception is masked, both       exception is masked, the
8474                 ST and ST(1) contain quiet      original operand remains
8475                 NaNs.                           unchanged, but is pushed
8476                                                 to ST(1).
8477
8478 FSIN, FCOS,     Perform three common            Do not exist.                 None.                           Upgrade.
8479 FSINCOS         trigonometric functions.
8480
8481 FPATAN          Range of operands is            |ST(0)| must be smaller       None.                           Upgrade.
8482                 unrestricted.                   than |ST(1)|.
8483
8484 F2XM1           Wider range of operand          The supported operand         None.                           Upgrade.
8485                 (-1 ≤ ST(0) ≤ +1).              range is 0 ≤ ST(0) ≤ 0.5.
8486
8487 FLD             Does not report denormal        Reports denormal exception.   None.                           Upgrade.
8488 extended-real   exception because the
8489                 instruction is not arithmetic.
8490
8491 FXTRACT         If the operand is zero, the     If the operand is zero,       None. Software usually          IEEE 754 recommendation
8492                 zero-divide exception is        ST(1) is zero and no          bypasses zero and ∞.            to fully support the logb
8493                 reported and ST(1) is -∞.       exception is reported. If                                     function.
8494                 If the operand is +∞, no        the operand is +∞, the
8495                 exception is reported.          invalid-operation exception
8496                                                 is reported.
8497
8498 FLD constant    Rounding control is in          Rounding control is not in    Results are the same as         IEEE 754 recommendation.
8499                 effect.                         effect.                       for the 8087/80287 when
8500                                                                               rounding control is set to
8501                                                                               round to zero, round to
8502                                                                               -∞, and (in the case of
8503                                                                               FLDL2T) round to nearest.
8504                                                                               Results are different by
8505                                                                               one in the least significant
8506                                                                               bit of the significand in
8507                                                                               round to +∞ and round to
8508                                                                               nearest (excluding
8509                                                                               FLDL2T). FLD1 and FLDZ
8510                                                                               are always the same.
8511
8512 FLD             Loading a denormal              Loading a denormal causes     If the next instruction is      IEEE Standard 754
8513 single/double   causes the number to be         the number to be converted    FXTRACT or FXAM, the            compatibility.
8514 precision       converted to extended           to an unnormal.               80387 will give a different
8515                 precision (because it is put                                  result than the 80287/8087.
8516                 on the stack).
8517
8518 FLD             When loading a signaling        Does not raise an             The exception handler           IEEE Standard 754
8519 single/double   NaN, raises invalid exception.  exception when loading a      needs to be updated to          compatibility.
8520 precision                                       signaling NaN.                handle this condition.
8521
8522 FSETPM          Treated as FNOP (no             Informs the 80287 that the    None.                           The 80386 handles all
8523                 operation).                     system is in protected                                        addressing and
8524                                                 mode.                                                         exception-pointer information,
8525                                                                                                               whether in protected mode
8526                                                                                                               or not.
8527
8528 FXAM            When encountering an            May generate these            None.                           Upgrade, to provide
8529                 empty register, the 80387       combinations, among others.                                   repeatable results.
8530                 will not generate
8531                 combinations of C3-C0 equal to
8532                 1101 or 1111.
8533
8534 All             May generate different          Round-up bit of status        None.                           Upgrade, to signal
8535 Transcendental  results in round-up bit of      word is undefined for these                                   rounding status.
8536 Instructions    status word.                    instructions.
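
As an illustration of the FPREM entry in the table above, the following
Python sketch extracts the three low-order quotient bits from a stored
status word. The function name is hypothetical. The bit positions
(C0 = bit 8, C1 = bit 9, C3 = bit 14) follow the 80387 status word layout,
and the sketch assumes that C0, C3, C1 are listed from the most significant
to the least significant of the three quotient bits.

```python
def fprem_quotient_low_bits(status_word):
    """Return the three low-order quotient bits left by FPREM (0..7)."""
    c0 = (status_word >> 8) & 1    # assumed to hold quotient bit 2
    c1 = (status_word >> 9) & 1    # assumed to hold quotient bit 0
    c3 = (status_word >> 14) & 1   # assumed to hold quotient bit 1
    return (c0 << 2) | (c3 << 1) | c1


# C0 = 1, C3 = 0, C1 = 1 corresponds to a quotient ending in binary 101.
print(fprem_quotient_low_bits(0x0300))
```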
8537
8538
8539 Appendix D  Compatibility Between the 80387 and the 8087
8540
8541 ---------------------------------------------------------------------------
8542
8543 The 80386/80387 operating in real-address mode will execute 8087 programs
8544 without major modification. However, because of differences in the handling
8545 of numeric exceptions between the 80387 NPX and the 8087 NPX,
8546 exception-handling routines may need to be changed.
8547
8548 This appendix summarizes the additional differences between the 80387 NPX
8549 and the 8087 NPX (other than those already included in Appendix B), and
8550 provides details showing how 8087 programs can be ported to the 80387.
8551
8552   1.  The 80387 signals exceptions through a dedicated ERROR# line to
8553       the 80386; no interrupt controller is needed for this purpose. The
8554       8087 requires an interrupt controller (8259A) to interrupt the CPU
8555       when an unmasked exception occurs. Therefore, any
8556       interrupt-controller-oriented instructions in numeric exception
8557       handlers for the 8087 should be deleted.
8558
8559   2.  The 8087 instructions FENI/FNENI and FDISI/FNDISI perform no useful
8560       function in the 80387. If the 80387 encounters one of these opcodes in
8561       its instruction stream, the instruction will effectively be
8562       ignored; none of the 80387 internal states will be updated. While 8087
8563       code containing these instructions may be executed on the 80387, it
8564       is unlikely that the exception-handling routines containing these
8565       instructions will be completely portable to the 80387.
8566
8567   3.  In real mode and protected mode (not including virtual 8086 mode),
8568       interrupt vector 16 must point to the numeric exception handling
8569       routine. In virtual 8086 mode, the V86 monitor can be programmed to
8570       accommodate a different location of the interrupt vector for numeric
8571       exceptions.
8572
8573   4.  The ESC instruction address saved in the 80386/80387 or 80386/80287
8574       includes any leading prefixes before the ESC opcode. The corresponding
8575       address saved in the 8086/8087 does not include leading prefixes.
8576
8577   5.  In protected mode (not including virtual 8086 mode), the format of
8578       the 80387's saved instruction and address pointers is different from
8579       that of the 8087. The instruction opcode is not saved in protected
8580       mode; exception handlers must retrieve the opcode from memory
8581       if needed.
8582
8583   6.  Interrupt 7 will occur in the 80386 when executing ESC instructions
8584       with either TS (task switched) or EM (emulation) of the 80386 MSW set
8585       (TS=1 or EM=1). If TS is set, then a WAIT instruction will also cause
8586       interrupt 7. An exception handler should be included in 80387 code to
8587       handle these situations.
8588
8589   7.  Interrupt 9 will occur if the second or subsequent words of a
8590       floating-point operand fall outside a segment's size. Interrupt 13
8591       will occur if the starting address of a numeric operand falls outside
8592       a segment's size. An exception handler should be included to report
8593       these programming errors.
8594
8595   8.  Except for the processor control instructions, all of the 80387
8596       numeric instructions are automatically synchronized by the 80386
8597       CPU: the 80386 automatically waits until all operands have been
8598       transferred between the 80386 and the 80387 before executing the
8599       next ESC instruction. No explicit WAIT instructions are required to
8600       assure this synchronization. For the 8087 used with 8086 and 8088
8601       processors, explicit WAITs are required before each numeric
8602       instruction to ensure synchronization. Although 8087 programs having
8603       explicit WAIT instructions will execute perfectly on the 80387
8604       without reassembly, these WAIT instructions are unnecessary.
8605
8606   9.  Since the 80387 does not require WAIT instructions before each
8607       numeric instruction, the ASM386 assembler does not automatically
8608       generate these WAIT instructions. The ASM86 assembler, however,
8609       automatically precedes every ESC instruction with a WAIT
8610       instruction. Although numeric routines generated using the ASM86
8611       assembler will generally execute correctly on the 80386/80387,
8612       reassembly using ASM386 may result in a more compact code image and
8613       faster execution.
8614
8615       The processor control instructions for the 80387 may be coded using
8616       either a WAIT or No-WAIT form of mnemonic. The WAIT forms of these
8617       instructions cause ASM386 to precede the ESC instruction with a CPU
8618       WAIT instruction, in the identical manner as does ASM86.
8619
8620   10. The address of a memory operand stored by FSAVE or FSTENV is
8621       undefined if the previous ESC instruction did not refer to memory.
8622
8623   11. Because the 80387 automatically normalizes denormal numbers when
8624       possible, an 8087 program that uses the denormal exception solely to
8625       normalize denormal operands can run on an 80387 by masking the
8626       denormal exception. The 8087 denormal exception handler would not be
8627       used by the 80387 in this case. A numerics program runs faster when
8628       the 80387 performs normalization of denormal operands. A program can
8629       detect at run-time whether it is running on an 80387 or 8087/80287 and
8630       disable the denormal exception when an 80387 is used.
8631
8632
8633 Appendix E  80387 80-Bit CHMOS III Numeric Processor Extension
8634
8635 For Advance Information on the Intel 80387 please consult Appendix E of the
8636 printed version of this book or the 80387 Data Sheet, order number 231920.
8637
8638
8639 Appendix F  PC/AT-Compatible 80387 Connection
8640
8641 ---------------------------------------------------------------------------
8642
8643 The PC/AT uses a nonstandard scheme to report 80287 exceptions to the
8644 80286. When replicating the PC/AT coprocessor interface in 80386-based
8645 systems, the PC/AT interface cannot be used in exactly the same way;
8646 however, this appendix outlines a similar interface that works on
8647 80386/80387 systems and maintains compatibility with the nonstandard PC/AT
8648 scheme.
8649
8650 Note that the interface outlined here does not represent a new interface
8651 standard; it needs to be incorporated in AT-compatible designs only because
8652 the 80286 and 80287 in the PC/AT are not connected according to the
8653 standards defined by Intel. The standard 80386/80387 connection recommended
8654 by Intel in the 80387 Data Sheet functions properly; the 80386
8655 implementation has not been and will not be altered.
8656
8657
8658 F.1  The PC/AT Interface
8659
8660 In the PC/AT, the ERROR# input to the 80286 is tied inactive (high)
8661 permanently. The ERROR# output of the 80287 is tied to an interrupt port
8662 (IRQ13). This interrupt replaces exception signaling via the 80286's ERROR#
8663 input. To guarantee (in the case of an 80287 exception) that INTR 13 will be
8664 serviced prior to the execution of any further 80287 instructions, an
8665 edge-triggered flip-flop latches BUSY# using ERROR# as a clock. The output
8666 of this latch is ORed with the BUSY# output of the 80287 and drives the
8667 BUSY# input of the 80286. This PC/AT scheme effectively delays deactivation
8668 of BUSY# at the 80286 whenever an 80287 ERROR# is signaled.
8669
8670 Since the 80286 BUSY# input remains active after an exception, the 80286
8671 interrupt 13 handler is guaranteed to execute before any other 80287
8672 instructions may begin. The interrupt 13 handler clears the BUSY# latch (via
8673 a write to a special I/O port), thus allowing execution of 80287
8674 instructions to proceed. The interrupt 13 handler then branches to the NMI
8675 handler, where the user-defined numerics exception handler resides in
8676 PC-compatible systems.
8677
8678 The use of an interrupt guarantees that an exception from a coprocessor
8679 instruction will be detected. Latching BUSY# guarantees that any coprocessor
8680 instruction (except FINIT, FSETPM, and FCLEX) following the instruction that
8681 raised the exception will not be executed before the NMI handler is
8682 executed.
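
The latch behavior described in this section can be modeled at the logic
level. The Python sketch below is only an illustration: the class and method
names are hypothetical, signals are represented as booleans where True means
"active" (the electrical polarity of BUSY# and ERROR# is ignored), and the
clear operation stands in for the handler's write to the special I/O port.

```python
class PcAtBusyLatch:
    """Logic-level model of the PC/AT BUSY# latch (True means active)."""

    def __init__(self):
        self.latched = False
        self._previous_error = False

    def update(self, npx_busy, npx_error):
        """Apply the NPX outputs; return the level seen at the CPU BUSY# input."""
        # The flip-flop is edge-triggered: it samples BUSY# only when
        # ERROR# goes from inactive to active.
        if npx_error and not self._previous_error:
            self.latched = npx_busy
        self._previous_error = npx_error
        # The latch output is ORed with the NPX BUSY# output.
        return npx_busy or self.latched

    def clear(self):
        """Model the interrupt handler's I/O write that clears the latch."""
        self.latched = False


latch = PcAtBusyLatch()
latch.update(npx_busy=True, npx_error=True)          # exception while busy
print(latch.update(npx_busy=False, npx_error=True))  # CPU still sees busy
latch.clear()
print(latch.update(npx_busy=False, npx_error=True))  # busy released
```

In this model the CPU-side busy level stays active after the NPX itself goes
idle, and drops only once the handler clears the latch.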
8683
8684 This PC/AT scheme approximates the exception reporting scheme between the
8685 8087 and 8088 in the original PC.
8686
8687
8688 F.2  How to Achieve the Same Effect in an 80386 System
8689
8690 The 80386 can use a PC/AT-compatible interface to communicate with an 80387
8691 provided that, when an NPX exception occurs, BUSY# active time is extended
8692 and PEREQ is reactivated only after 80387 BUSY# has gone inactive. The 80387
8693 is left active (tying STEN high) at all times. Also, the 80386 and 80387
8694 must be reset by the same RESET signal.
8695
8696 The reactivation of PEREQ for the 80386 is needed for store instructions
8697 (for example, FST mem) because the 80387 drops PEREQ once it signals an
8698 exception. While the 80386 has not yet recognized the occurrence of the
8699 exception, it still expects the data transfers to complete via PEREQ
8700 reactivation. It is permissible for the 80386 to receive undefined data
8701 during such I/O read cycles. Disabling the 80387 is not necessary, because
8702 the dummy data-transfer cycles directed to the 80387 when PEREQ is
8703 externally reactivated for the 80386 will not disturb the operation of the
8704 80387. The interrupt 13 handler should remove the extension of BUSY# and
8705 reactivation of PEREQ via a write to PC/AT-compatible hardware at I/O port
8706 F0H.
8707
8708
8709 Glossary of 80387 and Floating-Point Terminology
8710
8711 ---------------------------------------------------------------------------
8712
8713 This glossary defines many terms that have precise technical meanings as
8714 specified in the IEEE 754 Standard or as specified in this manual. Where
8715 these terms are used, they have been italicized to emphasize the precision
8716 of their meanings.
8717
8718 Base
8719   (1) a term used in logarithms and exponentials. In both contexts, it is a
8720   number that is being raised to a power. The two equations (y = log base b
8721   of x) and (b^(y) = x) are the same.
8722
8723 Base
8724   (2) a number that defines the representation being used for a string of
8725   digits. Base 2 is the binary representation; base 10 is the decimal
8726   representation; base 16 is the hexadecimal representation. In each case,
8727   the base is the factor of increased significance for each succeeding
8728   digit (working up from the bottom).
8729
8730 Bias
8731   a constant that is added to the true exponent of a real number to obtain
8732   the exponent field of that number's floating-point representation in the
8733   80387. To obtain the true exponent, you must subtract the bias from the
8734   given exponent. For example, the single real format has a bias of 127
8735   whenever the given exponent is nonzero. If the 8-bit exponent field
8736   contains 10000011, which is 131, the true exponent is 131-127, or +4.
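
The subtraction in this example can be reproduced with a short Python
sketch. The helper name single_true_exponent is hypothetical, and the sketch
assumes the single format layout of 1 sign bit, 8 exponent bits, and 23
fraction bits. It is meaningful only when the exponent field is neither
all zeros nor all ones.

```python
import struct

SINGLE_BIAS = 127  # bias of the single format


def single_true_exponent(value):
    """Return (biased exponent, true exponent) of value in single format."""
    # Pack as a 32-bit float, then reread the same bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    biased = (bits >> 23) & 0xFF   # the 8-bit exponent field
    return biased, biased - SINGLE_BIAS


# 16.0 is 1.0 * 2^4, so its exponent field holds 10000011 (131).
print(single_true_exponent(16.0))
```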
8737
8738 Biased Exponent
8739   the exponent as it appears in a floating-point representation of a number.
8740   The biased exponent is interpreted as an unsigned, positive number. In the
8741   above example, 131 is the biased exponent.
8742
8743 Binary Coded Decimal
8744   a method of storing numbers that retains a base 10 representation. Each
8745   decimal digit occupies 4 full bits (one hexadecimal digit). The
8746   hexadecimal values A through F (1010 through 1111) are not used. The
8747   80387 supports a packed decimal format that consists of 9 bytes of binary
8748   coded decimal (18 decimal digits) and one sign byte.
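
A minimal Python sketch of this encoding follows. The function name is
hypothetical. The sketch assumes the 80387 memory layout in which the lowest
addressed byte holds the two least significant digits and bit 7 of the last
byte holds the sign; the remaining bits of the sign byte are written as zero.

```python
def to_packed_decimal(number):
    """Encode an integer as a 10-byte packed decimal image."""
    magnitude = abs(number)
    if magnitude >= 10 ** 18:
        raise ValueError("packed decimal holds at most 18 digits")
    image = bytearray(10)
    for index in range(9):
        # Peel off two decimal digits at a time, least significant first.
        magnitude, pair = divmod(magnitude, 100)
        image[index] = ((pair // 10) << 4) | (pair % 10)
    image[9] = 0x80 if number < 0 else 0x00   # sign byte
    return bytes(image)


print(to_packed_decimal(-1234).hex())
```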
8749
8750 Binary Point
8751   an entity just like a decimal point, except that it exists in binary
8752   numbers. Each binary digit to the right of the binary point is multiplied
8753   by an increasing negative power of two.
8754
8755 C3-C0
8756   the four "condition code" bits of the 80387 status word. These bits are
8757   set to certain values by the compare, test, examine, and remainder
8758   functions of the 80387.
8759
8760 Characteristic
8761   a term used for some non-Intel computers, meaning the exponent field of a
8762   floating-point number.
8763
8764 Chop
8765   to set one or more low-order bits of a real number to zero, yielding the
8766   nearest representable number in the direction of zero.
8767
8768 Condition Code
8769   the four bits of the 80387 status word that indicate the results of the
8770   compare, test, examine, and remainder functions of the 80387.
8771
8772 Control Word
8773   a 16-bit 80387 register that the user can set, to determine the modes of
8774   computation the 80387 will use and the exception interrupts that will be
8775   enabled.
8776
8777 Denormal
8778   a special form of floating-point number. On the 80387, a denormal is
8779   defined as a number that has a biased exponent of zero. By providing a
8780   significand with leading zeros, the range of possible negative
8781   exponents can be extended by the number of bits in the significand.
8782   Each leading zero is a bit of lost accuracy, so the extended exponent
8783   range is obtained by reducing significance.
8784
8785 Double Extended
8786   the Standard's term for the 80387's extended format, with more exponent
8787   and significand bits than the double format and an explicit integer bit
8788   in the significand.
8789
8790 Double Format
8791   a floating-point format supported by the 80387 that consists of a sign, an
8792   11-bit biased exponent, an implicit integer bit, and a 52-bit
8793   significand, for a total of 64 explicit bits.
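
The three explicit fields can be separated with a few shifts and masks. The
following Python sketch is illustrative only; the function name is
hypothetical, and it relies on Python floats being stored in the IEEE double
format.

```python
import struct


def double_fields(value):
    """Split a double into (sign, biased exponent, 52-bit significand field)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF   # 11-bit field
    fraction = bits & ((1 << 52) - 1)        # integer bit is not stored
    return sign, biased_exponent, fraction


# -2.5 is -1.25 * 2^1: sign 1, biased exponent 1 + 1023, fraction .01 binary.
print(double_fields(-2.5))
```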
8794
8795 Environment
8796   the 14 or 28 (depending on addressing mode) bytes of 80387 registers
8797   affected by the FSTENV and FLDENV instructions. It encompasses the entire
8798   state of the 80387, except for the 8 registers of the 80387 stack.
8799   Included are the control word, status word, tag word, and the instruction,
8800   opcode, and operand information provided by interrupts.
8801
8802 Exception
8803   any of the six conditions (invalid operation, denormal, numeric overflow,
8804   numeric underflow, zero-divide, and precision) detected by the 80387 that
8805   may be signaled by status flags or by traps.
8806
8807 Exception Pointers
8808   The data maintained by the 80386 to help exception handlers identify
8809   the cause of an exception. This data consists of a pointer to the most
8810   recently executed ESC instruction and a pointer to the memory operand of
8811   this instruction, if it had a memory operand. An exception handler can use
8812   the FSTENV and FSAVE instructions to access these pointers.
8813
8814 Exponent
8815   (1) any number that indicates the power to which another number is raised.
8816
8817 Exponent
8818   (2) the field of a floating-point number that indicates the magnitude of
8819   the number. This would fall under the above more general definition (1),
8820   except that a bias sometimes needs to be subtracted to obtain the correct
8821   power.
8822
8823 Extended Format
8824   the 80387's implementation of the Standard's double extended format.
8825   Extended format is the main floating-point format used by the 80387.
8826   It consists of a sign, a 15-bit biased exponent, and a significand with an
8827   explicit integer bit and 63 fractional-part bits.
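
As a sketch of how these fields combine, the Python function below decodes a
10-byte extended format memory image into an exact rational value. The
function name is hypothetical. The sketch assumes little-endian byte order in
memory and the extended format bias of 16383, and it does not handle
infinities or NaNs.

```python
from fractions import Fraction

EXTENDED_BIAS = 16383


def decode_extended(image):
    """Decode a 10-byte little-endian extended format image exactly."""
    if len(image) != 10:
        raise ValueError("extended format occupies 10 bytes")
    significand = int.from_bytes(image[:8], "little")   # integer bit + 63 bits
    sign_and_exponent = int.from_bytes(image[8:], "little")
    sign = -1 if sign_and_exponent & 0x8000 else 1
    biased = sign_and_exponent & 0x7FFF
    if biased == 0x7FFF:
        raise ValueError("infinity or NaN")
    # A biased exponent of zero uses the same scale as a biased exponent of 1.
    exponent = (biased if biased else 1) - EXTENDED_BIAS
    # The binary point sits just after the explicit integer bit (bit 63).
    return sign * Fraction(significand, 1 << 63) * Fraction(2) ** exponent


print(decode_extended(bytes.fromhex("0000000000000080ff3f")))
```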
8828
8829 Floating-Point
8830   of or pertaining to a number that is expressed as a base, a sign, a
8831   significand, and a signed exponent. The value of the number is the signed
8832   product of its significand and the base raised to the power of the
8833   exponent. Floating-point representations are more versatile than integer
8834   representations in two ways. First, they include fractions. Second, their
8835   exponent parts allow a much wider range of magnitude than possible with
8836   fixed-length integer representations.
8837
8838 Gradual Underflow
8839   a method of handling the underflow error condition that minimizes the loss
8840   of accuracy in the result. If there is a denormal number that represents
8841   the correct result, that denormal is returned. Thus, digits are lost only
8842   to the extent of denormalization. Most computers return zero when
8843   underflow occurs, losing all significant digits.
8844
8845 Implicit Integer Bit
8846   a part of the significand in the single real and double real formats
8847   that is not explicitly given. In these formats, the entire given
8848   significand is considered to be to the right of the binary point. A single
8849   implicit integer bit to the left of the binary point is always one, except
8850   in one case. When the exponent is the minimum (biased exponent is zero),
8851   the implicit integer bit is zero.
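
The rule for the implicit integer bit can be made concrete by rebuilding a
single format value from its stored fields. This Python sketch is
illustrative; the function name is hypothetical, and it assumes that a biased
exponent of zero is scaled by 2^(-126), the same scale as the smallest normal
exponent.

```python
import struct


def single_value_from_fields(bits):
    """Rebuild the value of a 32-bit single format pattern from its fields."""
    sign = -1.0 if bits >> 31 else 1.0
    biased = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased == 0xFF:
        raise ValueError("infinity or NaN")
    if biased == 0:
        integer_bit, exponent = 0, -126     # denormal or zero
    else:
        integer_bit, exponent = 1, biased - 127
    significand = integer_bit + fraction / 2 ** 23
    return sign * significand * 2.0 ** exponent


# Compare against the value produced by the standard conversion.
pattern = 0x00000001   # smallest positive denormal
print(single_value_from_fields(pattern)
      == struct.unpack(">f", struct.pack(">I", pattern))[0])
```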
8852
8853 Indefinite
8854   a special value that is returned by functions when the inputs are such
8855   that no other sensible answer is possible. For each floating-point format
8856   there exists one quiet NaN that is designated as the indefinite value. For
8857   binary integer formats, the negative number furthest from zero is often
8858   considered the indefinite value. For the 80387 packed decimal format, the
8859   indefinite value contains all 1's in the sign byte and the uppermost
8860   digits byte.
8861
8862 Inexact
8863   The Standard's term for the 80387's precision exception.
8864
8865 Infinity
8866   a value that has greater magnitude than any integer or any real number. It
8867   is often useful to consider infinity as another number, subject to special
8868   rules of arithmetic. All three Intel floating-point formats provide
8869   representations for +∞ and -∞.
8870
8871 Integer
8872   a number (positive, negative, or zero) that is finite and has no
8873   fractional part. Integer can also mean the computer representation for
8874   such a number: a sequence of data bytes, interpreted in a standard way. It
8875   is perfectly reasonable for integers to be represented in a floating-point
8876   format; this is what the 80387 does whenever an integer is pushed onto the
8877   80387 stack.
8878
8879 Integer Bit
8880   a part of the significand in floating-point formats. In these formats, the
8881   integer bit is the only part of the significand considered to be to the
8882   left of the binary point. The integer bit is always one, except in one
8883   case: when the exponent is the minimum (biased exponent is zero), the
8884   integer bit is zero. In the extended format the integer bit is explicit;
8885   in the single format and double format the integer bit is implicit; i.e.,
8886   it is not actually stored in memory.
8887
8888 Invalid Operation
8889   the exception condition for the 80387 that covers all cases not covered by
8890   other exceptions. Included are 80387 stack overflow and underflow, NaN
8891   inputs, illegal infinite inputs, out-of-range inputs, and inputs in
8892   unsupported formats.
8893
8894 Long Integer
8895   an integer format supported by the 80387 that consists of a 64-bit two's
8896   complement quantity.
8897
8898 Long Real
8899   an older term for the 80387's 64-bit double format.
8900
8901 Mantissa
8902   a term used with some non-Intel computers for the significand of a
8903   floating-point number.
8904
8905 Masked
8906   a term that applies to each of the six 80387 exceptions I,D,Z,O,U,P. An
8907   exception is masked if a corresponding bit in the 80387 control word is
8908   set to one. If an exception is masked, the 80387 will not generate an
8909   interrupt when the exception condition occurs; it will instead provide its
8910   own exception recovery.
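
A short Python sketch of reading the mask bits out of a control word image
follows. The names are hypothetical. The sketch assumes the 80387 control
word layout in which the I, D, Z, O, U, and P masks occupy bits 0 through 5.

```python
# Assumed bit positions of the six exception masks in the control word.
EXCEPTION_MASK_BITS = {"I": 0, "D": 1, "Z": 2, "O": 3, "U": 4, "P": 5}


def masked_exceptions(control_word):
    """Return the names of the exceptions masked in a control word image."""
    return [name for name, bit in EXCEPTION_MASK_BITS.items()
            if control_word & (1 << bit)]


# 037FH masks all six exceptions; 037EH leaves invalid operation unmasked.
print(masked_exceptions(0x037F))
print(masked_exceptions(0x037E))
```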
8911
8912 Mode
8913   One of the control word fields "rounding control" and "precision control"
8914   which programs can set, sense, save, and restore to control the execution
8915   of subsequent arithmetic operations.
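
The two mode fields can be read from a saved control word as sketched below
in Python. The names are hypothetical. The sketch assumes the 80387 control
word layout, with precision control in bits 9-8 and rounding control in bits
11-10, and the usual encodings of those fields.

```python
# Assumed field encodings of the 80387 control word.
ROUNDING_MODES = {0: "nearest", 1: "down", 2: "up", 3: "chop"}
PRECISION_BITS = {0: 24, 1: None, 2: 53, 3: 64}   # encoding 1 is reserved


def decode_modes(control_word):
    """Return (rounding mode, significand precision in bits)."""
    rounding = (control_word >> 10) & 0x3
    precision = (control_word >> 8) & 0x3
    return ROUNDING_MODES[rounding], PRECISION_BITS[precision]


print(decode_modes(0x037F))   # round to nearest, 64-bit precision
```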
8916
8917 NaN
8918   an abbreviation for "Not a Number"; a floating-point quantity that does
8919   not represent any numeric or infinite quantity. NaNs should be returned
8920   by functions that encounter serious errors. If created during a sequence
8921   of calculations, they are transmitted to the final answer and can contain
8922   information about where the error occurred.
8923
8924 Normal
8925   the representation of a number in a floating-point format in which the
8926   significand has an integer bit one (either explicit or implicit).
8927
8928 Normalize
8929   convert a denormal representation of a number to a normal representation.
8930
8931 NPX
8932   Numeric Processor Extension. This is the 80387, 80287, or 8087.
8933
8934 Overflow
8935   an exception condition in which the correct answer is finite, but has
8936   magnitude too great to be represented in the destination format. This kind
8937   of overflow (also called numeric overflow) is not to be confused with
8938   stack overflow.
8939
8940 Packed Decimal
8941   an integer format supported by the 80387. A packed decimal number is a
8942   10-byte quantity, with nine bytes of 18 binary coded decimal digits and
8943   one byte for the sign.
8944
8945 Pop
8946   to remove from a stack the last item that was placed on the stack.
8947
8948 Precision
8949   The effective number of bits in the significand of the floating-point
8950   representation of a number.
8951
8952 Precision Control
8953   an option, programmed through the 80387 control word, that allows all
8954   80387 arithmetic to be performed with reduced precision. Because no
8955   speed advantage results from this option, its only use is for strict
8956   compatibility with the standard and with other computer systems.
8957
8958 Precision Exception
8959   an 80387 exception condition that results when a calculation does not
8960   return an exact answer. This exception is usually masked and ignored; it
8961   is used only in extremely critical applications, when the user must know
8962   if the results are exact. The precision exception is called inexact
8963   in the standard.
8964
8965 Pseudozero
8966   one of a set of special values of the extended real format. The set
8967   consists of numbers with a zero significand and an exponent that is
8968   neither all zeros nor all ones. Pseudozeros are not created by the 80387
8969   but are handled correctly when encountered as operands.
8970
8971 Quiet NaN
8972   a NaN in which the most significant bit of the fractional part of the
8973   significand is one. By convention, these NaNs can undergo certain
8974   operations without causing an exception.
8975
8976 Real
8977   any finite value (negative, positive, or zero) that can be represented by
8978   a (possibly infinite) decimal expansion. Reals can be represented as the
8979   points of a line marked off like a ruler. The term real can also refer
8980   to a floating-point number that represents a real value.
8981
8982 Short Integer
8983   an integer format supported by the 80387 that consists of a 32-bit two's
8984   complement quantity. Short integer is not the shortest 80387 integer
8985   format; the 16-bit word integer is.
8986
8987 Short Real
8988   an older term for the 80387's 32-bit single format.
8989
8990 Signaling NaN
8991   a NaN that causes an invalid-operation exception whenever it enters into
8992   a calculation or comparison, even an unordered comparison.
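
The distinction between quiet and signaling NaNs depends on a single bit,
which the following Python sketch tests for the single format. The function
name is hypothetical; the sketch assumes a 32-bit pattern whose exponent
field is bits 30-23 and whose fraction is bits 22-0.

```python
def classify_single_nan(bits):
    """Classify a 32-bit single format pattern as quiet NaN, signaling NaN,
    or not a NaN."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    # A NaN has an all-ones exponent and a nonzero fraction.
    if exponent != 0xFF or fraction == 0:
        return "not a NaN"
    # The most significant fraction bit distinguishes the two kinds.
    return "quiet NaN" if fraction & 0x400000 else "signaling NaN"


print(classify_single_nan(0x7FC00000))
print(classify_single_nan(0x7F800001))
```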
8993
8994 Significand
8995   the part of a floating-point number that consists of the most significant
8996   nonzero bits of the number, if the number were written out in an unlimited
8997   binary format. The significand is composed of an integer bit and a
8998   fraction. The integer bit is implicit in the single format and double
8999   format. The significand is considered to have a binary point after the
9000   integer bit; the binary point is then moved according to the value of the
9001   exponent.
9002
9003 Single Extended
9004   a floating-point format, required by the standard, that provides greater
9005   precision than single; it also provides an explicit integer bit in the
9006   significand. The 80387's extended format meets the single extended
9007   requirement as well as the double extended requirement.
9008
9009 Single Format
9010   a floating-point format supported by the 80387, which consists of a sign,
9011   an 8-bit biased exponent, an implicit integer bit, and a 23-bit
9012   significand, for a total of 32 explicit bits.
9013
9014 Stack Fault
9015   a special case of the invalid-operation exception which is indicated by a
9016   one in the SF bit of the status word. This condition usually results from
9017   stack underflow or overflow.
9018
9019 Standard
9020   "IEEE Standard for Binary Floating-Point Arithmetic," ANSI/IEEE
9021   Std 754-1985.
9022
9023 Status Word
9024   a 16-bit 80387 register that can be manually set, but which is usually
9025   controlled by side effects of 80387 instructions. It contains condition
9026   codes, the 80387 stack pointer, busy and interrupt bits, and exception
9027   flags.
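
       A minimal C sketch of how software might pick these fields out of a
       status word image, such as one stored to memory by FSTSW. The bit
       positions follow the 80387 status word layout (exception flags in bits
       0-5, SF in bit 6, C0-C2 in bits 8-10, TOP in bits 11-13, C3 in bit 14,
       B in bit 15); the function names are hypothetical.

```c
#include <stdint.h>

/* Exception flags I, D, Z, O, U, P occupy bits 0..5. */
unsigned sw_exception_flags(uint16_t sw) { return sw & 0x3Fu; }

/* SF (stack fault) is bit 6. */
unsigned sw_stack_fault(uint16_t sw) { return (sw >> 6) & 1u; }

/* TOP is the three-bit field in bits 11..13. */
unsigned sw_top(uint16_t sw) { return (sw >> 11) & 7u; }

/* B (busy) is bit 15. */
unsigned sw_busy(uint16_t sw) { return (sw >> 15) & 1u; }

/* Condition codes packed as C3 C2 C1 C0 (C3 is bit 14, C2..C0 are bits 10..8). */
unsigned sw_condition_codes(uint16_t sw)
{
    return ((sw >> 8) & 7u) | (((sw >> 14) & 1u) << 3);
}
```

       With the image 3841 (hex), for instance, sw_top returns 7,
       sw_stack_fault returns 1, and sw_exception_flags returns 1 (the
       invalid-operation flag).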
9028
9029 Tag Word
9030   a 16-bit 80387 register that is automatically maintained by the 80387. For
9031   each space in the 80387 stack, it tells if the space is occupied by a
9032   number; if so, it gives information about what kind of number.
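
       The following C sketch shows one way to read a tag out of a tag word
       image (for example, one saved by FSTENV). It relies on the 80387 tag
       encoding of two bits per physical register (00 valid, 01 zero, 10
       special, 11 empty) and on tags being indexed by physical register
       number rather than by stack-relative ST(i). The names are hypothetical.

```c
#include <stdint.h>

enum fpu_tag {
    TAG_VALID   = 0,  /* 00: valid nonzero number */
    TAG_ZERO    = 1,  /* 01: zero */
    TAG_SPECIAL = 2,  /* 10: invalid, infinity, or denormal */
    TAG_EMPTY   = 3   /* 11: register is empty */
};

/* Tag of physical register phys_reg (0..7): bits 2n+1..2n of the tag word. */
enum fpu_tag tag_of_physical(uint16_t tag_word, unsigned phys_reg)
{
    return (enum fpu_tag)((tag_word >> (2u * (phys_reg & 7u))) & 3u);
}

/* ST(i) maps to physical register (TOP + i) modulo 8. */
unsigned st_to_physical(unsigned top, unsigned i)
{
    return (top + i) & 7u;
}
```

       After FINIT the tag word is FFFF (hex), so every register reads as
       TAG_EMPTY. If TOP is 7 and the tag word is 3FFF (hex), then ST(0) is
       physical register 7 and is tagged valid.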
9033
9034 Temporary Real
9035   an older term for the 80387's 80-bit extended format.
9036
9037 Tiny
9038   of or pertaining to a floating-point number that is so close to zero that
9039   its exponent is smaller than the smallest exponent that can be represented in
9040   the destination format.
9041
9042 TOP
9043   The three-bit field of the status word that indicates which 80387 register
9044   is the current top of stack.
9045
9046 Transcendental
9047   one of a class of functions for which polynomial formulas are always
9048   approximate, never exact for more than isolated values. The 80387 supports
9049   trigonometric, exponential, and logarithmic functions; all are
9050   transcendental.
9051
9052 Two's Complement
9053   a method of representing integers. If the uppermost bit is zero, the
9054   number is considered positive, with the value given by the rest of the
9055   bits. If the uppermost bit is one, the number is negative, with the value
9056   obtained by subtracting 2^(bit count) from the unsigned value of all the
9057   given bits. For example, the 8-bit number 11111100 is -4, obtained by
9058   subtracting 2^8 from 252.
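
       The rule above can be sketched in C as follows. The function name is
       hypothetical, and the sketch assumes a width between 1 and 32 bits; a
       64-bit intermediate is used so that subtracting 2^32 cannot overflow.

```c
#include <stdint.h>

/* Value of the low 'width' bits of 'bits' read as a two's complement
   integer. Assumption: 1 <= width <= 32. */
int64_t twos_complement_value(uint32_t bits, unsigned width)
{
    uint64_t v = bits;

    if (width < 32)
        v &= ((uint64_t)1 << width) - 1;      /* keep only the given bits */

    if ((v >> (width - 1)) & 1)               /* uppermost bit is one */
        return (int64_t)v - ((int64_t)1 << width);

    return (int64_t)v;                        /* uppermost bit is zero */
}
```

       Applied to the example in the text, twos_complement_value(0xFC, 8)
       computes 252 - 256 and returns -4.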
9059
9060 Unbiased Exponent
9061   the true value that tells how far and in which direction to move the
9062   binary point of the significand of a floating-point number. For example,
9063   if a single-format exponent is 131, we subtract the bias 127 to obtain the
9064   unbiased exponent +4. Thus, the real number being represented is the
9065   significand with the binary point shifted 4 bits to the right.
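
       A small C sketch of the same arithmetic. It assumes the host float type
       is the IEEE single format and covers normal numbers only (biased
       exponents 1 through 254); the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Unbiased exponent of a normal single-format number: subtract the bias 127. */
int unbiased_exponent_single(unsigned biased_exp)
{
    return (int)biased_exp - 127;
}

/* Build a positive single-format number from a biased exponent and a
   23-bit fraction. Assumption: float is the 32-bit IEEE single format. */
float single_from_fields(unsigned biased_exp, uint32_t fraction)
{
    uint32_t bits = ((uint32_t)(biased_exp & 0xFFu) << 23)
                  | (fraction & 0x7FFFFFu);
    float x;

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

       With a biased exponent of 131 and a fraction of 200000 (hex), the
       significand is 1.01 (binary). Shifting the binary point 4 bits to the
       right gives 10100 (binary), so single_from_fields(131, 0x200000)
       returns 20.0.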
9066
9067 Underflow
9068   an exception condition in which the correct answer is nonzero, but has a
9069   magnitude too small to be represented as a normal number in the
9070   destination floating-point format. The Standard specifies that an attempt
9071   be made to represent the number as a denormal. This denormalization may
9072   result in a loss of significant bits from the significand. This kind of
9073   underflow (also called numeric underflow) is not to be confused with stack
9074   underflow.
9075
9076 Unmasked
9077   a term that applies to each of the six 80387 exceptions: I, D, Z, O, U, P. An
9078   exception is unmasked if a corresponding bit in the 80387 control word is
9079   set to zero. If an exception is unmasked, the 80387 will generate an
9080   interrupt when the exception condition occurs. You can provide an
9081   interrupt routine that customizes your exception recovery.
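
       As a sketch of the masking rule, the C fragment below tests mask bits
       in a control word image (for example, one stored by FSTCW). It relies
       on the 80387 control word layout, in which the masks for I, D, Z, O, U,
       and P occupy bits 0 through 5 in the same order as the corresponding
       status word flags. The names are hypothetical.

```c
#include <stdint.h>

/* Bit positions shared by the control word masks and status word flags. */
enum fpu_exception {
    EXC_I = 0,  /* invalid operation */
    EXC_D = 1,  /* denormalized operand */
    EXC_Z = 2,  /* zero divide */
    EXC_O = 3,  /* overflow */
    EXC_U = 4,  /* underflow */
    EXC_P = 5   /* precision */
};

/* An exception is unmasked when its control word bit is zero. */
int is_unmasked(uint16_t control_word, enum fpu_exception e)
{
    return ((control_word >> e) & 1u) == 0;
}

/* Exception flags set in the status word whose masks are clear. */
unsigned pending_unmasked(uint16_t status_word, uint16_t control_word)
{
    return status_word & ~control_word & 0x3Fu;
}
```

       The control word after FINIT is 037F (hex), which masks all six
       exceptions. Clearing bit 2 gives 037B (hex), which unmasks only the
       zero-divide exception.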
9082
9083 Unnormal
9084   an extended real representation in which the explicit integer bit of the
9085   significand is zero and the exponent is nonzero. Unnormal values are
9086   not supported by the 80387; they cause the invalid-operation exception
9087   when encountered as operands.
9088
9089 Unsupported Format
9090   Any number representation that is not recognized by the 80387. This
9091   includes several formats that are recognized by the 8087 and 80287;
9092   namely: pseudo-NaN, pseudoinfinity, and unnormal.
9093
9094 Word Integer
9095   an integer format supported by both the 80386 and the 80387 that consists
9096   of a 16-bit two's complement quantity.
9097
9098 Zero Divide
9099   an exception condition in which the inputs are finite, but the correct
9100   answer, even with an unlimited exponent, has infinite magnitude.